SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Barenkamp, Stephen J 

(ii) TITLE OF INVENTION: High Molecular Weight Surface Proteins 
of Non-Typeable Haemophilus 

(iii) NUMBER OF SEQUENCES: 11 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Shoemaker and Mattare, Ltd. 

(B) STREET: 2001 Jefferson Davis Hwy., 1203 Crystal Plaza

Bldg. 1 

(C) CITY: Arlington 

(D) STATE: Virginia 

(E) COUNTRY: U.S.A. 

(F) ZIP: 22202-0286 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/617,697 

(B) FILING DATE: 01-APR-1996

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/302,832 

(B) FILING DATE: 05-OCT-1994 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US PCT/US93/02166

(B) FILING DATE: 16-MAR-1993 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Berkstresser, Jerry W

(B) REGISTRATION NUMBER: 22,651 

(C) REFERENCE/DOCKET NUMBER: 1038-557 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (703) 415-0810 

(B) TELEFAX: (703) 415-0813 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5116 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
ACAGCGTTCT CTTAATACTA GTACAAACCC ACAATAAAAT ATGACAAACA ACAATTACAA 60
CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAATA GTATAAATCC GCCATATAAA 120
ATGGTATAAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC 180
TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 240
ACATGCCCTG ATGAACCGAG GGAAGGGAGG GAGGGGCAAG AATGAAGAGG GAGCTGAACG 300
AACGCAAATG ATAAAGTAAT TTAATTGTTC AACTAACCTT AGGAGAAAAT ATGAACAAGC 360
TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT GAATTGGCAC 420
GGGGTTGTGA CCATTCCACA GAAAAAGGCA GCGAAAAACC TGCTCGCATG AAAGTGCGTC 480
ACTTAGCGTT AAAGCCACTT TCCGCTATGT TACTATCTTT AGGTGTAACA TCTATTCCAC 540
AATCTGTTTT AGCAAGCGGC TTACAAGGAA TGGATGTAGT ACACGGCACA GCCACTATGC 600
AAGTAGATGG TAATAAAACC ATTATCCGCA ACAGTGTTGA CGATATCATT AATTGGAAAC 660
AATTTAACAT CGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAACAAC AACTCCGCCG 720
TATTCAACCG TGTTACATCT AACCAAATCT CCCAATTAAA AGGGATTTTA GATTCTAACG 780
GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA ATTATTAACA 840
CTAATGGCTT TACGGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG GCGCGTAATT 900
TCACCTTCGA GCAAACCAAA GATAAAGCGC TCGCTGAAAT TGTGAATCAC GGTTTAATTA 960
CTGTCGGTAA AGACGGCAGT GTAAATCTTA TTGGTGGCAA AGTGAAAAAC GAGGGTGTGA 1020
TTAGCGTAAA TGGTGGCAGC ATTTCTTTAC TCGCAGGGCA AAAAATCACC ATCAGCGATA 1080
TAATAAACCC AACCATTACT TACAGCATTG CCGCGCCTGA AAATGAAGCG GTCAATCTGG 1140
GCGATATTTT TGCCAAAGGC GGTAACATTA ATGTCCGTGC TGCCACTATT CGAAACCAAG 1200
GTAAACTTTC TGCTGATTCT GTAAGCAAAG ATAAAAGCGG CAATATTGTT CTTTCCGCCA 1260
AAGAGGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA GCTAAAGGCG 1320
GCAAGCTGAT GATTACAGGC GATAAAGTCA CATTAAAAAC AGGTGCAGTT ATCGACCTTT 1380
CAGGTAAAGA AGGGGGAGAA ACTTACCTTG GCGGTGACGA GCGCGGCGAA GGTAAAAAGG 1440
GCATTCAATT AGCAAAGAAA ACCTCTTTAG AAAAAGGCTC AACCATCAAT GTATCAGGCA 1500
AAGAAAAAGG CGGACGCGCT ATTGTGTGGG GCGATATTGC GTTAATTGAC GGCAATATTA 1560
ACGCTCAAGG TAGTGGTGAT ATCGCTAAAA CCGGTGGTTT TGTGGAGACG TCGGGGCATG 1620
ATTTATTCAT CAAAGACAAT GCAATTGTTG ACGCCAAAGA GTGGTTGTTA GACCCGGATA 1680
ATGTATCTAT TAATGCAGAA ACAGCAGGAC GCAGCAATAC TTCAGAAGAC GATGAATACA 1740
CGGGATCCGG GAATAGTGCC AGCACCCCAA AACGAAACAA AGAAAAGACA ACATTAACAA 1800
ACACAACTCT TGAGAGTATA CTAAAAAAAG GTACCTTTGT TAACATCACT GCTAATCAAC 1860
GCATCTATGT CAATAGCTCC ATTAATTTAT CCAATGGCAG CTTAACTCTT TGGAGTGAGG 1920
GTCGGAGCGG TGGCGGCGTT GAGATTAACA ACGATATTAC CACCGGTGAT GATACCAGAG 1980
GTGCAAACTT AACAATTTAC TCAGGCGGCT GGGTTGATGT TCATAAAAAT ATCTCACTCG 2040
GGGCGCAAGG TAACATAAAC ATTACAGCTA AACAAGATAT CGCCTTTGAG AAAGGAAGCA 2100
ACCAAGTCAT TACAGGTCAA GGGACTATTA CCTCAGGCAA TCAAAAAGGT TTTAGATTTA 2160
ATAATGTCTC TCTAAACGGC ACTGGCAGCG GACTGCAATT CACCACTAAA AGAACCAATA 2220



AATACGCTAT CACAAATAAA TTTGAAGGGA CTTTAAATAT TTCAGGGAAA GTGAACATCT 2280
CAATGGTTTT ACCTAAAAAT GAAAGTGGAT ATGATAAATT CAAAGGACGC ACTTACTGGA 2340
ATTTAACCTC CTTAAATGTT TCCGAGAGTG GCGAGTTTAA CCTCACTATT GACTCCAGAG 2400
GAAGCGATAG TGCAGGCACA CTTACCCAGC CTTATAATTT AAACGGTATA TCATTCAACA 2460
AAGACACTAC CTTTAATGTT GAACGAAATG CAAGAGTCAA CTTTGACATC AAGGCACCAA 2520
TAGGGATAAA TAAGTATTCT AGTTTGAATT ACGCATCATT TAATGGAAAC ATTTCAGTTT 2580
CGGGAGGGGG GAGTGTTGAT TTCACACTTC TCGCCTCATC CTCTAACGTC CAAACCCCCG 2640
GTGTAGTTAT AAATTCTAAA TACTTTAATG TTTCAACAGG GTCAAGTTTA AGATTTAAAA 2700
CTTCAGGCTC AACAAAAACT GGCTTCTCAA TAGAGAAAGA TTTAACTTTA AATGCCACCG 2760
GAGGCAACAT AACACTTTTG CAAGTTGAAG GCACCGATGG AATGATTGGT AAAGGCATTG 2820
TAGCCAAAAA AAACATAACC TTTGAAGGAG GTAACATCAC CTTTGGCTCC AGGAAAGCCG 2880
TAACAGAAAT CGAAGGCAAT GTTACTATCA ATAACAACGC TAACGTCACT CTTATCGGTT 2940
CGGATTTTGA CAACCATCAA AAACCTTTAA CTATTAAAAA AGATGTCATC ATTAATAGCG 3000
GCAACCTTAC CGCTGGAGGC AATATTGTCA ATATAGCCGG AAATCTTACC GTTGAAAGTA 3060
ACGCTAATTT CAAAGCTATC ACAAATTTCA CTTTTAATGT AGGCGGCTTG TTTGACAACA 3120
AAGGCAATTC AAATATTTCC ATTGCCAAAG GAGGGGCTCG CTTTAAAGAC ATTGATAATT 3180
CCAAGAATTT AAGCATCACC ACCAACTCCA GCTCCACTTA CCGCACTATT ATAAGCGGCA 3240
ATATAACCAA TAAAAACGGT GATTTAAATA TTACGAACGA AGGTAGTGAT ACTGAAATGC 3300
AAATTGGCGG CGATGTCTCG CAAAAAGAAG GTAATCTCAC GATTTCTTCT GACAAAATCA 3360
ATATTACCAA ACAGATAACA ATCAAGGCAG GTGTTGATGG GGAGAATTCC GATTCAGACG 3420
CGACAAACAA TGCCAATCTA ACCATTAAAA CCAAAGAATT GAAATTAACG CAAGACCTAA 3480
ATATTTCAGG TTTCAATAAA GCAGAGATTA CAGCTAAAGA TGGTAGTGAT TTAACTATTG 3540
GTAACACCAA TAGTGCTGAT GGTACTAATG CCAAAAAAGT AACCTTTAAC CAGGTTAAAG 3600
ATTCAAAAAT CTCTGCTGAC GGTCACAAGG TGACACTACA CAGCAAAGTG GAAACATCCG 3660
GTAGTAATAA CAACACTGAA GATAGCAGTG ACAATAATGC CGGCTTAACT ATCGATGCAA 3720
AAAATGTAAC AGTAAACAAC AATATTACTT CTCACAAAGC AGTGAGCATC TCTGCGACAA 3780
GTGGAGAAAT TACCACTAAA ACAGGTACAA CCATTAACGC AACCACTGGT AACGTGGAGA 3840
TAACCGCTCA AACAGGTAGT ATCCTAGGTG GAATTGAGTC CAGCTCTGGC TCTGTAACAC 3900
TTACTGCAAC CGAGGGCGCT CTTGCTGTAA GCAATATTTC GGGCAACACC GTTACTGTTA 3960
CTGCAAATAG CGGTGCATTA ACCACTTTGG CAGGCTCTAC AATTAAAGGA ACCGAGAGTG 4020
TAACCACTTC AAGTCAATCA GGCGATATCG GCGGTACGAT TTCTGGTGGC ACAGTAGAGG 4080
TTAAAGCAAC CGAAAGTTTA ACCACTCAAT CCAATTCAAA AATTAAAGCA ACAACAGGCG 4140
AGGCTAACGT AACAAGTGCA ACAGGTACAA TTGGTGGTAC GATTTCCGGT AATACGGTAA 4200
ATGTTACGGC AAACGCTGGC GATTTAACAG TTGGGAATGG CGCAGAAATT AATGCGACAG 4260
AAGGAGCTGC AACCTTAACT ACATCATCGG GCAAATTAAC TACCGAAGCT AGTTCACACA 4320
TTACTTCAGC CAAGGGTCAG GTAAATCTTT CAGCTCAGGA TGGTAGCGTT GCAGGAAGTA 4380
TTAATGCCGC CAATGTGACA CTAAATACTA CAGGCACTTT AACTACCGTG AAGGGTTCAA 4440
ACATTAATGC AACCAGCGGT ACCTTGGTTA TTAACGCAAA AGACGCTGAG CTAAATGGCG 4500
CAGCATTGGG TAACCACACA GTGGTAAATG CAACCAACGC AAATGGCTCC GGCAGCGTAA 4560
TCGCGACAAC CTCAAGCAGA GTGAACATCA CTGGGGATTT AATCACAATA AATGGATTAA 4620
ATATCATTTC AAAAAACGGT ATAAACACCG TACTGTTAAA AGGCGTTAAA ATTGATGTGA 4680
AATACATTCA ACCGGGTATA GCAAGCGTAG ATGAAGTAAT TGAAGCGAAA CGCATCCTTG 4740
AGAAGGTAAA AGATTTATCT GATGAAGAAA GAGAAGCGTT AGCTAAACTT GGAGTAAGTG 4800
CTGTACGTTT TATTGAGCCA AATAATACAA TTACAGTCGA TACACAAAAT GAATTTGCAA 4860
CCAGACCATT AAGTCGAATA GTGATTTCTG AAGGCAGGGC GTGTTTCTCA AACAGTGATG 4920
GCGCGACGGT GTGCGTTAAT ATCGCTGATA ACGGGCGGTA GCGGTCAGTA ATTGACAAGG 4980
TAGATTTCAT CCTGCAATGA AGTCATTTTA TTTTCGTATT ATTTACTGTG TGGGTTAAAG 5040
TTCAGTACGG GCTTTACCCA TCTTGTAAAA AATTACGGAG AATACAATAA AGTATTTTTA 5100
ACAGGTTATT ATTATG 5116

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125



Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415

Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Thr Ala Gly Arg Ser Asn Thr Ser Glu Asp Asp Glu Tyr Thr
450 455 460

Gly Ser Gly Asn Ser Ala Ser Thr Pro Lys Arg Asn Lys Glu Lys Thr
465 470 475 480



Thr Leu Thr Asn Thr Thr Leu Glu Ser Ile Leu Lys Lys Gly Thr Phe
485 490 495

Val Asn Ile Thr Ala Asn Gln Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu Trp Ser Glu Gly Arg Ser Gly Gly
515 520 525

Gly Val Glu Ile Asn Asn Asp Ile Thr Thr Gly Asp Asp Thr Arg Gly
530 535 540

Ala Asn Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn
545 550 555 560

Ile Ser Leu Gly Ala Gln Gly Asn Ile Asn Ile Thr Ala Lys Gln Asp
565 570 575

Ile Ala Phe Glu Lys Gly Ser Asn Gln Val Ile Thr Gly Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Gln Lys Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Gly Thr Gly Ser Gly Leu Gln Phe Thr Thr Lys Arg Thr Asn Lys
610 615 620



Tyr Ala Ile Thr Asn Lys Phe Glu Gly Thr Leu Asn Ile Ser Gly Lys
625 630 635 640

Val Asn Ile Ser Met Val Leu Pro Lys Asn Glu Ser Gly Tyr Asp Lys
645 650 655

Phe Lys Gly Arg Thr Tyr Trp Asn Leu Thr Ser Leu Asn Val Ser Glu
660 665 670

Ser Gly Glu Phe Asn Leu Thr Ile Asp Ser Arg Gly Ser Asp Ser Ala
675 680 685

Gly Thr Leu Thr Gln Pro Tyr Asn Leu Asn Gly Ile Ser Phe Asn Lys
690 695 700

Asp Thr Thr Phe Asn Val Glu Arg Asn Ala Arg Val Asn Phe Asp Ile
705 710 715 720

Lys Ala Pro Ile Gly Ile Asn Lys Tyr Ser Ser Leu Asn Tyr Ala Ser
725 730 735

Phe Asn Gly Asn Ile Ser Val Ser Gly Gly Gly Ser Val Asp Phe Thr
740 745 750

Leu Leu Ala Ser Ser Ser Asn Val Gln Thr Pro Gly Val Val Ile Asn
755 760 765

Ser Lys Tyr Phe Asn Val Ser Thr Gly Ser Ser Leu Arg Phe Lys Thr
770 775 780

Ser Gly Ser Thr Lys Thr Gly Phe Ser Ile Glu Lys Asp Leu Thr Leu
785 790 795 800

Asn Ala Thr Gly Gly Asn Ile Thr Leu Leu Gln Val Glu Gly Thr Asn
805 810 815

Gly Met Ile Gly Lys Gly Ile Val Ala Lys Lys Asn Ile Thr Phe Glu
820 825 830

Gly Gly Asn Ile Thr Phe Gly Ser Arg Lys Ala Val Thr Glu Ile Glu
835 840 845

Gly Asn Val Thr Ile Asn Asn Asn Ala Asn Val Thr Leu Ile Gly Ser
850 855 860

Asp Phe Asp Asn His Gln Lys Pro Leu Thr Ile Lys Lys Asp Val Ile
865 870 875 880

Ile Asn Ser Gly Asn Leu Thr Ala Gly Gly Asn Ile Val Asn Ile Ala
885 890 895

Gly Asn Leu Thr Val Glu Ser Asn Ala Asn Phe Lys Ala Ile Thr Asn
900 905 910

Phe Thr Phe Asn Val Gly Gly Leu Phe Asp Asn Lys Gly Asn Ser Asn
915 920 925

Ile Ser Ile Ala Lys Gly Gly Ala Arg Phe Lys Asp Ile Asp Asn Ser
930 935 940

Lys Asn Leu Ser Ile Thr Thr Asn Ser Ser Ser Thr Tyr Arg Thr Ile
945 950 955 960

Ile Ser Gly Asn Ile Thr Asn Lys Asn Gly Asp Leu Asn Ile Thr Asn
965 970 975

Glu Gly Ser Asp Thr Glu Met Gln Ile Gly Gly Asp Val Ser Gln Lys
980 985 990

Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr Lys Gln
995 1000 1005

Ile Thr Ile Lys Ala Gly Val Asp Gly Glu Asn Ser Asp Ser Asp Ala
1010 1015 1020

Thr Asn Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys Leu Thr
1025 1030 1035 1040

Gln Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala Lys
1045 1050 1055

Asp Gly Ser Asp Leu Thr Ile Gly Asn Thr Asn Ser Ala Asp Gly Thr
1060 1065 1070

Asn Ala Lys Lys Val Thr Phe Asn Gln Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

Ala Asp Gly His Lys Val Thr Leu His Ser Lys Val Glu Thr Ser Gly
1090 1095 1100

Ser Asn Asn Asn Thr Glu Asp Ser Ser Asp Asn Asn Ala Gly Leu Thr
1105 1110 1115 1120

Ile Asp Ala Lys Asn Val Thr Val Asn Asn Asn Ile Thr Ser His Lys
1125 1130 1135

Ala Val Ser Ile Ser Ala Thr Ser Gly Glu Ile Thr Thr Lys Thr Gly
1140 1145 1150

Thr Thr Ile Asn Ala Thr Thr Gly Asn Val Glu Ile Thr Ala Gln Thr
1155 1160 1165

Gly Ser Ile Leu Gly Gly Ile Glu Ser Ser Ser Gly Ser Val Thr Leu
1170 1175 1180






Thr Ala Thr Glu Gly Ala Leu Ala Val Ser Asn Ile Ser Gly Asn Thr
1185 1190 1195 1200

Val Thr Val Thr Ala Asn Ser Gly Ala Leu Thr Thr Leu Ala Gly Ser
1205 1210 1215

Thr Ile Lys Gly Thr Glu Ser Val Thr Thr Ser Ser Gln Ser Gly Asp
1220 1225 1230

Ile Gly Gly Thr Ile Ser Gly Gly Thr Val Glu Val Lys Ala Thr Glu
1235 1240 1245

Ser Leu Thr Thr Gln Ser Asn Ser Lys Ile Lys Ala Thr Thr Gly Glu
1250 1255 1260

Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly Thr Ile Ser Gly
1265 1270 1275 1280

Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu Thr Val Gly Asn
1285 1290 1295

Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr Leu Thr Thr Ser
1300 1305 1310

Ser Gly Lys Leu Thr Thr Glu Ala Ser Ser His Ile Thr Ser Ala Lys
1315 1320 1325

Gly Gln Val Asn Leu Ser Ala Gln Asp Gly Ser Val Ala Gly Ser Ile
1330 1335 1340

Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Val
1345 1350 1355 1360

Lys Gly Ser Asn Ile Asn Ala Thr Ser Gly Thr Leu Val Ile Asn Ala
1365 1370 1375

Lys Asp Ala Glu Leu Asn Gly Ala Ala Leu Gly Asn His Thr Val Val
1380 1385 1390

Asn Ala Thr Asn Ala Asn Gly Ser Gly Ser Val Ile Ala Thr Thr Ser
1395 1400 1405

Ser Arg Val Asn Ile Thr Gly Asp Leu Ile Thr Ile Asn Gly Leu Asn
1410 1415 1420

Ile Ile Ser Lys Asn Gly Ile Asn Thr Val Leu Leu Lys Gly Val Lys
1425 1430 1435 1440

Ile Asp Val Lys Tyr Ile Gln Pro Gly Ile Ala Ser Val Asp Glu Val
1445 1450 1455

Ile Glu Ala Lys Arg Ile Leu Glu Lys Val Lys Asp Leu Ser Asp Glu
1460 1465 1470

Glu Arg Glu Ala Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Ile
1475 1480 1485

Glu Pro Asn Asn Thr Ile Thr Val Asp Thr Gln Asn Glu Phe Ala Thr
1490 1495 1500

Arg Pro Leu Ser Arg Ile Val Ile Ser Glu Gly Arg Ala Cys Phe Ser
1505 1510 1515 1520

Asn Ser Asp Gly Ala Thr Val Cys Val Asn Ile Ala Asp Asn Gly Arg
1525 1530 1535
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4937 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

TAAATATACA AGATAATAAA AATAAATCAA GATTTTTGTG ATGACAAACA ACAATTACAA 60 

CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAAAT AGTATAAATC CGCCATATAA 120

AATGGTATAA TCTTTCATCT TTCATCTTTA ATCTTTCATC TTTCATCTTT CATCTTTCAT 180 

CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC TTTCATCTTT 240

CACATGAAAT GATGAACCGA GGGAAGGGAG GGAGGGGCAA GAATGAAGAG GGAGCTGAAC 300 

GAACGCAAAT GATAAAGTAA TTTAATTGTT CAACTAACCT TAGGAGAAAA TATGAACAAG 360 

ATATATCGTC TCAAATTCAG CAAACGCCTG AATGCTTTGG TTGCTGTGTC TGAATTGGCA 420 

CGGGGTTGTG ACCATTCCAC AGAAAAAGGC TTCCGCTATG TTACTATCTT TAGGTGTAAC 480

CACTTAGCGT TAAAGCCACT TTCCGCTATG TTACTATCTT TAGGTGTAAC ATCTATTCCA 540

CAATCTGTTT TAGCAAGCGG CTTACAAGGA ATGGATGTAG TACACGGCAC AGCCACTATG 600 

CAAGTAGATG GTAATAAAAC CATTATCCGC AACAGTGTTG ACGCTATCAT TAATTGGAAA 660 

CAATTTAACA TCGACCAAAA TGAAATGGTG CAGTTTTTAC AAGAAAACAA CAACTCCGCC 720

GTATTCAACC GTGTTACATC TAACCAAATC TCCCAATTAA AAGGGATTTT AGATTCTAAC 780

GGACAAGTCT TTTTAATCAA CCCAAATGGT ATCACAATAG GTAAAGACGC AATTATTAAC 840

ACTAATGGCT TTACGGCTTC TACGCTAGAC ATTTCTAACG AAAACATCAA GGCGCGTAAT 900 

TTCACCTTCG AGCAAACCAA AGATAAAGCG CTCGCTCAAA TTGTGAATCA CGGTTTAATT 960

ACTGTCGGTA AAGACGGCAG TGTAAATCTT ATTGGTGGCA AAGTGAAAAA CGAGGGTGTG 1020 

ATTAGCGTAA ATGGTGGCAG CATTTCTTTA CTCGCAGGGC AAAAAATCAC CATCAGCGAT 1080 

ATAATAAACC CAACCATTAC TTACAGCATT GCCGCGCCTG AAAATGAAGC GGTCAATCTG 1140 

GGCGATATTT TTGCCAAAGG CGGTAACATT AATGTCCGTG CTGCCACTAT TCGAAACCAA 1200 

GGTAAACTTT CTGCTGATTC TGTAAGCAAA GATAAAAGCG GCAATATTGT TCTTTCCGCC 1260 

AAAGAGGGTG AAGCGGAAAT TGGCGGTGTA ATTTCCGCTC AAAATCAGCA AGCTAAAGGC 1320 

GGCAAGCTGA TGATTACAGG CGATAAAGTC ACATTAAAAA CAGGTGCAGT TATCGACCTT 1380

TCAGGTAAAG AAGGGGGAGA AACTTACCTT GGCGGTGACG AGCGCGGCGA AGGTAAAAAC 1440 

GGCATTCAAT TAGCAAAGAA AACCTCTTTA GAAAAAGGCT CAACCATCAA TGTATCAGGC 1500 

AAAGAAAAAG GCGGACGCGC TATTGTGTGG GGCGATATTG CGTTAATTGA CGGCAATATT 1560 

AACGCTCAAG GTAGTGGTGA TATCGCTAAA ACCGGTGGTT TTGTGGAGAC ATCGGGGCAT 1620

TATTTATCCA TTGACAGCAA TGCAATTGTT AAAACAAAAG AGTGGTTGCT AGACCCTGAT 1680 



GATGTAACAA TTGAAGCCGA AGACCCCCTT CGCAATAATA CCGGTATAAA TGATGAATTC 1740

CCAACAGGCA CCGGTGAAGC AAGCGACCCT AAAAAAAATA GCGAACTCAA AACAACGCTA 1800 

ACCAATACAA CTATTTCAAA TTATCTGAAA AACGCCTGGA CAATGAATAT AACGGCATCA 1860

AGAAAACTTA CCGTTAATAG CTCAATCAAC ATCGGAAGCA ACTCCCACTT AATTCTCCAT 1920 

AGTAAAGGTC AGCGTGGCGG AGGCGTTCAG ATTGATGGAG ATATTACTTC TAAAGGCGGA 1980 

AATTTAACCA TTTATTCTGG CGGATGGGTT GATGTTCATA AAAATATTAC GCTTGATCAG 2040

GGTTTTTTAA ATATTACCGC CGCTTCCGTA GCTTTTGAAG GTGGAAATAA CAAAGCACGC 2100

GACGCGGCAA ATGCTAAAAT TGTCGCCCAG GGCACTGTAA CCATTACAGG AGAGGGAAAA 2160 

GATTTCAGGG CTAACAACGT ATCTTTAAAC GGAACGGGTA AAGGTCTGAA TATCATTTCA 2220 

TCAGTGAATA ATTTAACCCA CAATCTTAGT GGCACAATTA ACATATCTGG GAATATAACA 2280 

ATTAACCAAA CTACGAGAAA GAACACCTCG TATTGGCAAA CCAGCCATGA TTCGCACTGG 2340 

AACGTCAGTG CTCTTAATCT AGAGACAGGC GCAAATTTTA CCTTTATTAA ATACATTTCA 2400 

AGCAATAGCA AAGGCTTAAC AACACAGTAT AGAAGCTCTG CAGGGGTGAA TTTTAACGGC 2460 

GTAAATGGCA ACATGTCATT CAATCTCAAA GAAGGAGCGA AAGTTAATTT CAAATTAAAA 2520 

CCAAACGAGA ACATGAACAC AAGCAAACCT TTACCAATTC GGTTTTTAGC CAATATCACA 2580

GCCACTGGTG GGGGCTCTGT TTTTTTTGAT ATATATGCCA ACCATTCTGG CAGAGGGGCT 2640 

GAGTTAAAAA TGAGTGAAAT TAATATCTCT AACGGCGCTA ATTTTACCTT AAATTCCCAT 2700

GTTCGCGGCG ATGACGCTTT TAAAATCAAC AAAGACTTAA CCATAAATGC AACCAATTCA 2760

AATTTCAGCC TCAGACAGAC GAAAGATGAT TTTTATGACG GGTACGCACG CAATGCCATC 2820

AATTCAACCT ACAACATATC CATTCTGGGC GGTAATGTCA CCCTTGGTGG ACAAAACTCA 2880 

AGCAGCAGCA TTACGGGGAA TATTACTATC GAGAAAGCAG CAAATGTTAC GCTAGAAGCC 2940 
AATAACGCCC CTAATCAGCA AAACATAAGG GATAGAGTTA TAAAACTTGG CAGCTTGCTC 3000

GTTAATGGGA GTTTAAGTTT AACTGGCGAA AATGCAGATA TTAAAGGCAA TCTCACTATT 3060 

TCAGAAAGCG CCACTTTTAA AGGAAAGACT AGAGATACCC TAAATATCAC CGGCAATTTT 3120 

ACCAATAATG GCACTGCCGA AATTAATATA ACACAAGGAG TGGTAAAACT TGGCAATGTT 3180 

ACCAATGATG GTGATTTAAA CATTACCACT CACGCTAAAC GCAACCAAAG AAGCATCATC 3240 

GGCGGAGATA TAATCAACAA AAAAGGAAGC TTAAATATTA CAGACAGTAA TAATGATGCT 3300 

GAAATCCAAA TTGGCGGCAA TATCTCGCAA AAAGAAGGCA ACCTCACGAT TTCTTCCGAT 3360 

AAAATTAATA TCACCAAACA GATAACAATC AAAAAGGGTA TTGATGGAGA GGACTCTAGT 3420 

TCAGATGCGA CAAGTAATGC CAACCTAACT ATTAAAACCA AAGAATTGAA ATTGACAGAA 3480 

GACCTAAGTA TTTCAGGTTT CAATAAAGCA GAGATTACAG CCAAAGATGG TAGAGATTTA 3540 

ACTATTGGCA ACAGTAATGA CGGTAACAGC GGTGCCGAAG CCAAAACAGT AACTTTTAAC 3600 

AATGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAATG TGACACTAAA TAGCAAAGTG 3660 

AAAACATCTA GCAGCAATGG CGGACGTGAA AGCAATAGCG ACAACGATAC CGGCTTAACT 3720 
ATTACTGCAA AAAATGTAGA AGTAAACAAA GATATTACTT CTCTCAAAAC AGTAAATATC 3780
ACCGCGTCGG AAAAGGTTAC CACCACAGCA GGCTCGACCA TTAACGCAAC AAATGGCAAA 3840
GCAAGTATTA CAACCAAAAC AGGTGATATC AGCGGTACGA TTTCCGGTAA CACGGTAAGT 3900
GTTAGCGCGA CTGGTGATTT AACCACTAAA TCCGGCTCAA AAATTGAAGC GAAATCGGGT 3960
GAGGCTAATG TAACAAGTGC AACAGGTACA ATTGGCGGTA CAATTTCCGG TAATACGGTA 4020
AATGTTACGG CAAACGCTGG CGATTTAACA GTTGGGAATG GCGCAGAAAT TAATGCGACA 4080
GAAGGAGCTG CAACCTTAAC CGCAACAGGG AATACCTTGA CTACTGAAGC CGGTTCTAGC 4140
ATCACTTCAA CTAAGGGTCA GGTAGACCTC TTGGCTCAGA ATGGTAGCAT CGCAGGAAGC 4200
ATTAATGCTG CTAATGTGAC ATTAAATACT ACAGGCACCT TAACCACCGT GGCAGGCTCG 4260
GATATTAAAG CAACCAGCGG CACCTTGGTT ATTAACGCAA AAGATGCTAA GCTAAATGGT 4320
GATGCATCAG GTGATAGTAC AGAAGTGAAT GCAGTCAACG CAAGCGGCTC TGGTAGTGTG 4380
ACTGCGGCAA CCTCAAGCAG TGTGAATATC ACTGGGGATT TAAACACAGT AAATGGGTTA 4440
AATATCATTT CGAAAGATGG TAGAAACACT GTGCGCTTAA GAGGCAAGGA AATTGAGGTG 4500
AAATATATCC AGCCAGGTGT AGCAAGTGTA GAAGAAGTAA TTGAAGCGAA ACGCGTCCTT 4560
GAAAAAGTAA AAGATTTATC TGATGAAGAA AGAGAAACAT TAGCTAAACT TGGTGTAAGT 4620
GCTGTACGTT TTGTTGAGCC AAATAATACA ATTACAGTCA ATACACAAAA TGAATTTACA 4680
ACCAGACCGT CAAGTCAAGT GATAATTTCT GAAGGTAAGG CGTGTTTCTC AAGTGGTAAT 4740
GGCGCACGAG TATGTACCAA TGTTGCTGAC GATGGACAGC CGTAGTCAGT AATTGACAAG 4800
GTAGATTTCA TCCTGCAATG AAGTCATTTT ATTTTCGTAT TATTTACTGT GTGGGTTAAA 4860
GTTCAGTACG GGCTTTACCC ATCTTGTAAA AAATTACGGA GAATACAATA AAGTATTTTT 4920
AACAGGTTAT TATTATG 4937

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1477 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Phe Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415



Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Asp Pro Leu Phe Asn Asn Thr Gly Ile Asn Asp Glu Phe Pro
450 455 460

Thr Gly Thr Gly Glu Ala Ser Asp Pro Lys Lys Asn Ser Glu Leu Lys
465 470 475 480

Thr Thr Leu Thr Asn Thr Thr Ile Ser Asn Tyr Leu Lys Asn Ala Trp
485 490 495

Thr Met Asn Ile Thr Ala Ser Arg Lys Leu Thr Val Asn Ser Ser Ile
500 505 510

Asn Ile Gly Ser Asn Ser His Leu Ile Leu His Ser Lys Gly Gln Arg
515 520 525

Gly Gly Gly Val Gln Ile Asp Gly Asp Ile Thr Ser Lys Gly Gly Asn
530 535 540

Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr
545 550 555 560

Leu Asp Gln Gly Phe Leu Asn Ile Thr Ala Ala Ser Val Ala Phe Glu
565 570 575

Gly Gly Asn Asn Lys Ala Arg Asp Ala Ala Asn Ala Lys Ile Val Ala
580 585 590

Gln Gly Thr Val Thr Ile Thr Gly Glu Gly Lys Asp Phe Arg Ala Asn
595 600 605

Asn Val Ser Leu Asn Gly Thr Gly Lys Gly Leu Asn Ile Ile Ser Ser
610 615 620

Val Asn Asn Leu Thr His Asn Leu Ser Gly Thr Ile Asn Ile Ser Gly
625 630 635 640

Asn Ile Thr Ile Asn Gln Thr Thr Arg Lys Asn Thr Ser Tyr Trp Gln
645 650 655

Thr Ser His Asp Ser His Trp Asn Val Ser Ala Leu Asn Leu Glu Thr
660 665 670

Gly Ala Asn Phe Thr Phe Ile Lys Tyr Ile Ser Ser Asn Ser Lys Gly
675 680 685

Leu Thr Thr Gln Tyr Arg Ser Ser Ala Gly Val Asn Phe Asn Gly Val
690 695 700

Asn Gly Asn Met Ser Phe Asn Leu Lys Glu Gly Ala Lys Val Asn Phe
705 710 715 720

Lys Leu Lys Pro Asn Glu Asn Met Asn Thr Ser Lys Pro Leu Pro Ile
725 730 735

Arg Phe Leu Ala Asn Ile Thr Ala Thr Gly Gly Gly Ser Val Phe Phe
740 745 750

Asp Ile Tyr Ala Asn His Ser Gly Arg Gly Ala Glu Leu Lys Met Ser
755 760 765



Glu Ile Asn Ile Ser Asn Gly Ala Asn Phe Thr Leu Asn Ser His Val
770 775 780

Arg Gly Asp Asp Ala Phe Lys Ile Asn Lys Asp Leu Thr Ile Asn Ala
785 790 795 800

Thr Asn Ser Asn Phe Ser Leu Arg Gln Thr Lys Asp Asp Phe Tyr Asp
805 810 815

Gly Tyr Ala Arg Asn Ala Ile Asn Ser Thr Tyr Asn Ile Ser Ile Leu
820 825 830

Gly Gly Asn Val Thr Leu Gly Gly Gln Asn Ser Ser Ser Ser Ile Thr
835 840 845

Gly Asn Ile Thr Ile Glu Lys Ala Ala Asn Val Thr Leu Glu Ala Asn
850 855 860

Asn Ala Pro Asn Gln Gln Asn Ile Arg Asp Arg Val Ile Lys Leu Gly
865 870 875 880

Ser Leu Leu Val Asn Gly Ser Leu Ser Leu Thr Gly Glu Asn Ala Asp
885 890 895

Ile Lys Gly Asn Leu Thr Ile Ser Glu Ser Ala Thr Phe Lys Gly Lys
900 905 910

Thr Arg Asp Thr Leu Asn Ile Thr Gly Asn Phe Thr Asn Asn Gly Thr
915 920 925

Ala Glu Ile Asn Ile Thr Gln Gly Val Val Lys Leu Gly Asn Val Thr
930 935 940

Asn Asp Gly Asp Leu Asn Ile Thr Thr His Ala Lys Arg Asn Gln Arg
945 950 955 960

Ser Ile Ile Gly Gly Asp Ile Ile Asn Lys Lys Gly Ser Leu Asn Ile
965 970 975

Thr Asp Ser Asn Asn Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr
995 1000 1005

Lys Gln Ile Thr Ile Lys Lys Gly Ile Asp Gly Glu Asp Ser Ser Ser
1010 1015 1020

Asp Ala Thr Ser Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Thr Glu Asp Leu Ser Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asp Gly Arg Asp Leu Thr Ile Gly Asn Ser Asn Asp Gly Asn
1060 1065 1070

Ser Gly Ala Glu Ala Lys Thr Val Thr Phe Asn Asn Val Lys Asp Ser
1075 1080 1085

Lys Ile Ser Ala Asp Gly His Asn Val Thr Leu Asn Ser Lys Val Lys
1090 1095 1100

Thr Ser Ser Ser Asn Gly Gly Arg Glu Ser Asn Ser Asp Asn Asp Thr
1105 1110 1115 1120



Gly Leu Thr Ile Thr Ala Lys Asn Val Glu Val Asn Lys Asp Ile Thr
1125 1130 1135

Ser Leu Lys Thr Val Asn Ile Thr Ala Ser Glu Lys Val Thr Thr Thr
1140 1145 1150

Ala Gly Ser Thr Ile Asn Ala Thr Asn Gly Lys Ala Ser Ile Thr Thr
1155 1160 1165

Lys Thr Gly Asp Ile Ser Gly Thr Ile Ser Gly Asn Thr Val Ser Val
1170 1175 1180

Ser Ala Thr Val Asp Leu Thr Thr Lys Ser Gly Ser Lys Ile Glu Ala
1185 1190 1195 1200

Lys Ser Gly Glu Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly
1205 1210 1215

Thr Ile Ser Gly Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu
1220 1225 1230

Thr Val Gly Asn Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr
1235 1240 1245

Leu Thr Ala Thr Gly Asn Thr Leu Thr Thr Glu Ala Gly Ser Ser Ile
1250 1255 1260

Thr Ser Thr Lys Gly Gln Val Asp Leu Leu Ala Gln Asn Gly Ser Ile
1265 1270 1275 1280

Ala Gly Ser Ile Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr
1285 1290 1295

Leu Thr Thr Val Ala Gly Ser Asp Ile Lys Ala Thr Ser Gly Thr Leu
1300 1305 1310

Val Ile Asn Ala Lys Asp Ala Lys Leu Asn Gly Asp Ala Ser Gly Asp
1315 1320 1325

Ser Thr Glu Val Asn Ala Val Asn Ala Ser Gly Ser Gly Ser Val Thr
1330 1335 1340

Ala Ala Thr Ser Ser Ser Val Asn Ile Thr Gly Asp Leu Asn Thr Val
1345 1350 1355 1360

Asn Gly Leu Asn Ile Ile Ser Lys Asp Gly Arg Asn Thr Val Arg Leu
1365 1370 1375

Arg Gly Lys Glu Ile Glu Val Lys Tyr Ile Gln Pro Gly Val Ala Ser
1380 1385 1390

Val Glu Glu Val Ile Glu Ala Lys Arg Val Leu Glu Lys Val Lys Asp
1395 1400 1405

Leu Ser Asp Glu Glu Arg Glu Thr Leu Ala Lys Leu Gly Val Ser Ala
1410 1415 1420

Val Arg Phe Val Glu Pro Asn Asn Thr Ile Thr Val Asn Thr Gln Asn
1425 1430 1435 1440

Glu Phe Thr Thr Arg Pro Ser Ser Gln Val Ile Ile Ser Glu Gly Lys
1445 1450 1455

Ala Cys Phe Ser Ser Gly Asn Gly Ala Arg Val Cys Thr Asn Val Ala
1460 1465 1470

Asp Asp Gly Gln Pro
1475

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ACAGCGTTCT CTTAATACTA GTACAAACCC ACAATAAAAT ATGACAAACA ACAATTACAA 60

CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAATA GTATAAATCC GCCATATAAA 120 

ATGGTATAAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC 180 

TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 240 

ACATGAAATG ATGAACCGAG GGAAGGGAGG GAGGGGCAAG AATGAAGAGG GAGCTGAACG 300 

AACGCAAATG ATAAAGTAAT TTAATTGTTC AACTAACCTT AGGAGAAAAT ATGAACAAGA 360 

TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT GAATTGGCAC 420

GGGGTTGTGA CCATTCCACA GAAAAAGGCA GCGAAAAACC TGCTCGCATG AAAGTGCGTC 480 

ACTTAGCGTT AAAGCCACTT TCCGCTATGT TACTATCTTT AGGTGTAACA TCTATTCCAC 540

AATCTGTTTT AGCAAGCGGC TTACAAGGAA TGGATGTAGT ACACGGCACA GCCACTATGC 600 

AAGTAGATGG TAATAAAACC ATTATCCGCA ACAGTGTTGA CGCTATCATT AATTGGAAAC 660 

AATTTAACAT CGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAACAAC AACTCCGCCG 720 

TATTCAACCG TGTTACATCT AACCAAATCT CCCAATTAAA AGGGATTTTA GATTCTAACG 780

GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA ATTATTAACA 840 

CTAATGGCTT TACGGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG GCGCGTAATT 900 

TCACCTTCGA GCAAACCAAA GATAAAGCGC TCGCTGAAAT TGTGAATCAC GGTTTAATTA 960 

CTGTCGGTAA AGACGGCAGT GTAAATCTTA TTGGTGGCAA AGTGAAAAAC GAGGGTGTGA 1020 

TTAGCGTAAA TGGTGGCAGC ATTTCTTTAC TCGCAGGGCA AAAAATCACC ATCAGCGATA 1080 

TAATAAACCC AACCATTACT TACAGCATTG CCGCGCCTGA AAATGAAGCG GTCAATCTGG 1140 

GCGATATTTT TGCCAAAGGC GGTAACATTA ATGTCCGTGC TGCCACTATT CGAAACCAAG 1200

CTTTCCGCCA AAGAGGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 1260 

GCTAAAGGCG GCAAGCTGAT GATTACAGGC GATAAAGTCA CATTAAAAAC AGGTGCAGTT 1320 

ATCGACCTTT CAGGTAAAGA AGGGGGAGAA ACTTACCTTG GCGGTGACGA GCGCGGCGAA 1380 

GGTAAAAACG GCATTCAATT AGCAAAGAAA ACCTCTTTAG AAAAAGGCTC AACGATCAAT 1440 

GTATCAGGCA AAGAAAAAGG CGGACGCGCT ATTGTGTGGG GCGATATTCC GTTAATTGAC 1500
GGCAATATTA ACGCTCAAGG TAGTGGTGAT ATCGCTAAAA CCGGTGGTTT TGTGGAGACG
TCGGGGCATG ATTTATTCAT CAAAGACAAT GCAATTGTTG ACGCCAAAGA GTGGTTGTTA 
GACCCGGATA ATGTATCTAT TAATGCAGAA ACAGCAGGAC GCAGCAATAC TTCAGAAGAC 
GATGAATACA CGGGATCCGG GAATAGTGCC AGCACCCCAA AACGAAACAA AGAAAAGACA 
ACATTAACAA ACACAACTCT TGAGAGTATA CTAAAAAAAG GTACCTTTGT TAACATCACT 
GCTAATCAAC GCATCTATGT CAATAGCTCC ATTAATTTAT CCAATGGCAG CTTAACTCTT 
TGGAGTGAGG GTCGGAGCGG TGGCGGCGTT GAGATTAACA ACGATATTAC CACCGGTGAT 
GATACCAGAG GTGCAAACTT AACAATTTAC TCAGGCGGCT GGGTTGATGT TCATAAAAAT 
ATCTCACTCG GGGCGCAAGG TAACATAAAC ATTACAGCTA AACAAGATAT CGCCTTTGAG 
AAAGGAAGCA ACCAAGTCAT TACAGGTCAA GGGACTATTA CCTCAGGCAA TCAAAAAGGT 
TTTAGATTTA ATAATGTCTC TCTAAACGGC ACTGGCAGCG GACTGCAATT CACCACTAAA 
AGAACCAATA AATACGCTAT CACAAATAAA TTTGAAGGGA CTTTAAATAT TTCAGGGAAA 
GTGAACATCT CAATGGTTTT ACCTAAAAAT GAAAGTGGAT ATGATAAATT CAAAGGACGC 
ACTTACTGGA ATTTAACCTC GAAAGTGGAT ATGATAAATT CAAAGGACGC CCTCACTATT 
GACTCCAGAG GAAGCGATAG TGCAGGCACA CTTACCCAGC CTTATAATTT AAACGGTATA 
TCATTCAACA AAGACACTAC CTTTAATGTT GAACGAAATG CAAGAGTCAA CTTTGACATC 
AAGGCACCAA TAGGGATAAA TAAGTATTCT AGTTTGAATT ACGCATCATT TAATGGAAAC 
ATTTCAGTTT CGGGAGGGGG GAGTGTTGAT TTCACACTTC TCGCCTCATC CTCTAACGTC 
CAAACCCCCG GTGTAGTTAT AAATTCTAAA TACTTTAATG TTTCAACAGG GTCAAGTTTA 
AGATTTAAAA CTTCAGGCTC AACAAAAACT GGCTTCTCAA TAGAGAAAGA TTTAACTTTA 
AATGCCACCG GAGGCAACAT AACACTTTTG CAAGTTGAAG GCACCGATGG AATGATTGGT 
AAAGGCATTG TAGCCAAAAA AAACATAACC TTTGAAGGAG GTAAGATGAG GTTTGGCTCC * 
AGGAAAGCCG TAACAGAAAT CGAAGGCAAT GTTACTATCA ATAACAACGC TAACGTCACT 
CTTATCGGTT CGGATTTTGA CAACCATCAA AAACCTTTAA CTATTAAAAA AGATGTCATC 
ATTAATAGCG GCAACCTTAC CGCTGGAGGC AATATTGTCA ATATAGCCGG AAATCTTACC 
GTTGAAAGTA ACGCTAATTT CAAAGCTATC ACAAATTTCA CTTTTAATGT AGGCGGCTTG 
TTTGACAACA AAGGCAATTC AAATATTTCC ATTGCCAAAG GAGGGGCTCG CTTTAAAGAC 
ATTGATAATT CCAAGAATTT AAGCATCACC ACCAACTCCA GCTCCACTTA CCGCACTATT 
ATAAGCGGCA ATATAACCAA TAAAAACGGT GATTTAAATA TTACGAACGA AGGTAGTCAT 

actgaaatgc aaattggcgg cgatgtctcg caaaaagaag gtaatctcac gatttcttct 

GACAAAATCA ATATTACCAA ACAGATAACA ATCAAGGCAG GTGTTGATGG GGAGAATTCC 
GATTCAGACG CGACAAACAA TGCCAATCTA ACCATTAAAA CCAAAGAATT GAAATTAACG 
CAAGACCTAA ATATTTCAGG TTTCAATAAA GCAGAGATTA CAGCTAAAGA TGGTAGTGAT 
TTAACTATTG GTAACACCAA TAGTGCTGAT GGTACTAATG CCAAAAAAGT aacctttaac 



1S60 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 
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CAGGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAAGG TGACACTACA CAGCAAAGTG 3 600 

GAAACATCCG GTAGTAATAA CAACACTGAA GATAGCAGTG ACAATAATGC CGGCTTAACT 3660 

ATCGATGCAA AAAATGTAAC AGTAAACAAC AATATTACTT CTCACAAAGC AGTGAGCATC 3720 

TCTGCGACAA GTGGAGAAAT TACCACTAAA ACAGGTACAA CCATTAACGC AACCACTGGT 3780 

AACGTGGAGA TAACCGCTCA AACAGGTAGT ATCCTAGGTG GAATTGAGTC CAGCTCTGGC 3840

TCTGTAACAC TTACTGCAAC CGAGGGCGCT CTTGCTGTAA GCAATATTTC GGGCAACACC 3900

GTTACTGTTA CTGCAAATAG CGGTGCATTA ACCACTTTGG CAGGCTCTAC AATTAAAGGA 3960

ACCGAGAGTG TAACCACTTC AAGTCAATCA GGCGATATCG GCGGTACGAT TTCTGGTGGC 4020

ACAGTAGAGG TTAAAGCAAC CGAAAGTTTA ACCACTCAAT CCAATTCAAA AATTAAAGGA 4080 

ACAACAGGCG AGGCTAACGT AACAAGTGCA ACAGGTACAA TTGGTGGTAC GATTTCCGGT 4140 

AATACGGTAA ATGTTACGGC AAACGCTGGC GATTTAACAG TTGGGAATGG CGCAGAAATT 4200 

AATGCGACAG AAGGAGCTGC AACCTTAACT ACATCATCGG GCAAATTAAC TACCGAAGCT 4260

AGTTCACACA TTACTTCAGC CAAGGGTCAG GTAAATCTTT CAGCTCAGGA TGGTAGCGTT 4320 

GCAGGAAGTA TTAATGCCGC CAATGTGACA CTAAATACTA CAGGCACTTT AACTACCGTG 4380 

AAGGGTTCAA ACATTAATGC AACCAGCGGT ACCTTGGTTA TTAACGCAAA AGACGCTGAG 4440 

CTAAATGGCG CAGCATTGGG TAACCACACA GTGGTAAATG CAACCAACGC AAATGGCTCC 4500 

GGCAGCGTAA TCGCGACAAC CTCAAGCAGA GTGAACATCA CTGGGGATTT AATCACAATA 4560 

AATGGATTAA ATATCATTTC AAAAAACGGT ATAAACACCG TACTGTTAAA AGGCGTTAAA 4620 

ATTGATGTGA AATACATTCA ACCGGGTATA GCAAGCGTAG ATGAAGTAAT TGAAGCGAAA 4680 

CGCATCCTTG AGAAGGTAAA AGATTTATCT GATGAAGAAA GAGAAGCGTT AGCTAAACTT 4740 

GGCGTAAGTG CTGTACGTTT TATTGAGCCA AATAATACAA TTACAGTCGA TACACAAAAT 4800 

GAATTTGCAA CCAGACCATT AAGTCGAATA GTGATTTCTG AAGGCAGGGC GTGTTTCTCA 4860

AACAGTGATG GCGCGACGGT GTGCGTTAAT ATCGCTGATA ACGGGCGGTA GCGGTCAGTA 4920 

ATTGACAAGG TAGATTTCAT CCTGCAATGA AGTCATTTTA TTTTCGTATT ATTTACTGTG 4980

TGGGTTAAAG TTCAGTACGG GCTTTACCCA TCTTGTAAAA AATTACGGAG AATACAATAA 5040 

AGTATTTTTA ACAGGTTATT ATTATGAAAA ATATAAAAAG CAGATTAAAA CTCAGTGCAA 5100 

TATCAGTATT GCTTGGCCTG GCTTCTTCAT CATTGTATGC AGAAGAAGCG TTTTTAGTAA 5160 

AAGGCTTTCA GTTATCTGGT GCACTTGAAA CTTTAAGTGA AGACGCCCAA CTGTCTGTAG 5220 

CAAAATCTTT ATCTAAATAC CAAGGCTCGC AAACTTTAAC AAACCTAAAA ACAGCACAGC 5280 

TTGAATTACA GGCTGTGCTA GATAAGATTG AGCCAAATAA GTTTGATGTG ATATTGCCAC 5340 

AACAAACCAT TACGGATGGC AATATTATGT TTGAGCTAGT CTCGAAATCA GCCGCAGAAA 5400 

GCCAAGTTTT TTATAAGGCG AGCCAGGGTT ATAGTGAAGA AAATATCGCT CGTAGCCTGC 5460 

CATCTTTGAA ACAAGGAAAA GTGTATGAAG ATGGTCGTCA GTGGTTCGAT TTGCGTGAAT 5520 

TCAATATGGC AAAAGAAAAT CCACTTAAAG TCACTCGCGT GCATTACGAG TTAAACCCTA 5580

AAAACAAAAC CTCTGATTTG GTAGTTGCAG GTTTTTCGCC TTTTGGCAAA ACGCGTAGCT 5640

TTGTTTCCTA TGATAATTTC GGCGCAAGGG AGTTTAACTA TCAACGTGTA AGTCTAGGTT 5700

TTGTAAATGC CAATTTGACC GGACATGATG ATGTATTAAA TCTAAACGCA TTGACCAATG 5760

TAAAAGCACC ATCAAAATCT TATGCGGTAG GCATAGGATA TACTTATCCG TTTTATGATA 5820

AACACCAATC CTTAAGTCTT TATACCAGCA TGAGTTATGC TGATTCTAAT GATATCGACG 5880

GCTTACCAAG TGCGATTAAT CGTAAATTAT CAAAAGGTCA ATCTATCTCT GCGAATCTGA 5940

AATGGAGTTA TTATCTCCCG ACATTTAACC TTGGAATGGA AGACCAGTTT AAAATTAATT 6000

TAGGCTACAA CTACCGCCAT ATTAATCAAA CATCCGAGTT AAACACCCTG GGTGCAACGA 6060 

AGAAAAAATT TGCAGTATCA GGCGTAAGTG CAGGCATTGA TGGACATATC CAATTTACCC 6120 

CTAAAACAAT CTTTAATATT GATTTAACTC ATCATTATTA CGCGAGTAAA TTACCAGGCT 6180 

CTTTTGGAAT GGAGCGCATT GGCGAAACAT TTAATCGCAG CTATCACATT AGCACAGCCA 6240 

GTTTAGGGTT GAGTCAAGAG TTTGCTCAAG GTTGGCATTT TAGCAGTCAA TTATCGGGTC 6300 

AGTTTACTCT ACAAGATATA AGTAGCATAG ATTTATTCTC TGTAACAGGT ACTTATGGCG 6360

TCAGAGGCTT TAAATACGGC GGTGCAAGTG GTGAGCGCGG TCTTGTATGG CGTAATGAAT 6420 

TAAGTATGCC AAAATACACC CGCTTTCAAA TCAGCCCTTA TGCGTTTTAT GATGCAGGTC 6480 

AGTTCCGTTA TAATAGCGAA AATGCTAAAA CTTACGGCGA AGATATGCAC ACGGTATCCT 6540 

CTGCGGGTTT AGGCATTAAA ACCTCTCCTA CACAAAACTT AAGCTTAGAT GCTTTTGTTG 6600

CTCGTCGCTT TGCAAATGCC AATAGTGACA ATTTGAATGG CAACAAAAAA CGCACAAGCT 6660 

CACCTACAAC CTTCTGGGGT AGATTAACAT TCAGTTTCTA ACCCTGAAAT TTAATCAACT 6720 

GGTAAGCGTT CCGCCTACCA GTTTATAACT ATATGCTTTA CCCGCCAATT TACAGTCTAT 6780 

ACGCAACCCT GTTTTCATCC TTATATATCA AACAAACTAA GCAAACCAAG CAAACCAAGC 6840 

AAACCAAGCA AACCAAGCAA ACCAAGCAAA CCAAGCAAAC CAAGCAAACC AAGCAAACCA 6900

AGCAAACCAA GCAAACCAAG CAAACCAAGC AAACCAAGCA ATGCTAAAAA ACAATTTATA 6960

TGATAAACTA AAACATACTC CATACCATGG CAATACAAGG GATTTAATAA TATGACAAAA 7020 

GAAAATTTAC AAAGTGTTCC ACAAAATACG ACCGCTTCAC TTGTAGAATC AAACAACGAC 7080 

CAAACTTCCC TGCAAATACT TAAACAACCA CCCAAACCCA ACCTATTACG CCTGGAACAA 7140 

CATGTCGCCA AAAAAGATTA TGAGCTTGCT TGCCGCGAAT TAATGGCGAT TTTGGAAAAA 7200 

ATGGACGCTA ATTTTGGAGG CGTTCACGAT ATTGAATTTG ACGCACCTGC TCAGCTGGCA 7260 

TATCTACCCG AAAAACTACT AATTCATTTT GCCACTCGTC TCGCTAATGC AATTACAACA 7320 

CTCTTTTCCG ACCCCGAATT GGCAATTTCC GAAGAAGGGG CATTAAAGAT GATTAGCCTG 7380 

CAACGCTGGT TGACGCTGAT TTTTGCCTCT TCCCCCTACG TTAACGCAGA CCATATTCTC 7440 

AATAAATATA ATATCAACCC AGATTCCGAA GGTGGCTTTC ATTTAGCAAC AGACAACTCT 7500 

TCTATTGCTA AATTCTGTAT TTTTTACTTA CCCGAATCCA ATGTCAATAT GAGTTTAGAT 7560 

GCGTTATGGG CAGGGAATCA ACAACTTTGT GCTTCATTGT GTTTTGCGTT GCAGTCTTCA 7620

CGTTTTATTG GTACTGCATC TGCGTTTCAT AAAAGAGCGG TGGTTTTACA GTGGTTTCCT 7680

AAAAAACTCG CCGAAATTGC TAATTTAGAT GAATTGCCTG CAAATATCCT TCATGATGTA 7740

TATATGCACT GCAGTTATGA TTTAGCAAAA AACAAGCACG ATGTTAAGCG TCCATTAAAC 7800

GAACTTGTCC GCAAGCATAT CCTCACGCAA GGATGGCAAG ACCGCTACCT TTACACCTTA 7860

GGTAAAAAGG ACGGCAAACC TGTGATGATG GTACTGCTTG AACATTTTAA TTCGGGACAT 7920

TCGATTTATC GCACGCATTC AACTTCAATG ATTGCTGCTC GAGAAAAATT CTATTTAGTC 7980 

GGCTTAGGCC ATGAGGGCGT TGATAACATA GGTCGAGAAG TGTTTGACGA GTTCTTTGAA 8040 

ATCAGTAGCA ATAATATAAT GGAGAGACTG TTTTTTATCC GTAAACAGTG CGAAACTTTC 8100 

CAACCCGCAG TGTTCTATAT GCCAAGCATT GGCATGGATA TTACCACGAT TTTTGTGAGC 8160 

AACACTCGGC TTGCCCCTAT TCAAGCTGTA GCCTTGGGTC ATCCTGCCAC TACGCATTCT 8220

GAATTTATTG ATTATGTCAT CGTAGAAGAT GATTATGTGG GCAGTGAAGA TTGTTTTAGC 8280

GAAACCCTTT TACGCTTACC CAAAGATGCC CTACCTTATG TACCATCTGC ACTCGCCCCA 8340

CAAAAAGTGG ATTATGTACT CAGGGAAAAC CCTGAAGTAG TCAATATCGG TATTGCCGCT 8400 

ACCACAATGA AATTAAACCC TGAATTTTTG CTAACATTGC AAGAAATCAG AGATAAAGCT 8460 

AAAGTCAAAA TACATTTTCA TTTCGCACTT GGACAATCAA CAGGCTTGAC ACACCCTTAT 8520 

GTCAAATGGT TTATCGAAAG CTATTTAGGT GACGATGCCA CTGCACATCC CCACGCACCT 8580 

TATCACGATT ATCTGGCAAT ATTGCGTGAT TGCGATATGC TACTAAATCC GTTTCCTTTC 8640

GGTAATACTA ACGGCATAAT TGATATGGTT ACATTAGGTT TAGTTGGTGT ATGCAAAACG 8700 

GGGGATGAAG TACATGAACA TATTGATGAA GGTCTGTTTA AACGCTTAGG ACTACCAGAA 8760 

TGGCTGATAG CCGACACACG AGAAACATAT ATTGAATGTG CTTTGCGTCT AGCAGAAAAC 8820 

CATCAAGAAC GCCTTGAACT CCGTCGTTAC ATCATAGAAA ACAACGGCTT ACAAAAGCTT 8880 

TTTACAGGCG ACCCTCGTCC ATTGGGCAAA ATACTGCTTA AGAAAACAAA TGAATGGAAG 8940 

CGGAAGCACT TGAGTAAAAA ATAACGGTTT TTTAAAGTAA AAGTGCGGTT AATTTTCAAA 9000 

GCGTTTTAAA AACCTCTCAA AAATCAACCG CACTTTTATC TTTATAACGC TCCCGCGCGC 9060 

TGACAGTTTA TCTCTTTCTT AAAATACCCA TAAAATTGTG GCAATAGTTG GGTAATCAAA 9120 

TTCAATTGTT GATACGGCAA ACTAAAGACG GCGCGTTCTT CGGCAGTCAT C 9171 
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CGCCACTTCA ATTTTGGATT GTTGAAATTC AACTAACCAA AAAGTGCGGT TAAAATCTGT 60

GGAGAAAATA GGTTGTAGTG AAGAACGAGG TAATTGTTCA AAAGGATAAA GCTCTCTTAA 120

TTGGGCATTG GTTGGCGTTT CTTTTTCGGT TAATAGTAAA TTATATTCTG GACGACTATG 180

CAATCCACCA ACAACTTTAC CGTTGGTTTT AAGCGTTAAT GTAAGTTCTT GCTCTTCTTG 240

GCGAATACGT AATCCCATTT TTTGTTTAGC AAGAAAATGA TCGGGATAAT CATAATAGGT 300

GTTGCCCAAA AATAAATTTT GATGTTCTAA AATCATAAAT TTTGCAAGAT ATTGTGGCAA 360

TTCAATACCT ATTTGTGGCG AAATCGCCAA TTTTAATTCA ATTTCTTGTA GCATAATATT 420

TCCCACTCAA ATCAACTGGT TAAATATACA AGATAATAAA AATAAATCAA GATTTTTGTG 480

ATGACAAACA ACAATTACAA CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAAAT 540

AGTATAAATC CGCCATATAA AATGGTATAA TCTTTCATCT TTCATCTTTC ATCTTTCATC 600

TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 660

ATCTTTCATC TTTCATCTTT CACATGAAAT GATGAACCGA GGGAAGGGAG GGAGGGGCAA 720

GAATGAAGAG GGAGCTGAAC GAACGCAAAT GATAAAGTAA TTTAATTGTT CAACTAACCT 780

TAGGAGAAAA TATGAACAAG ATATATCGTC TCAAATTCAG CAAACGCCTG AATGCTTTGG 840

TTGCTGTGTC TGAATTGGCA CGGGGTTGTG ACCATTCCAC AGAAAAAGGC AGCGAAAAAC 900

CTGCTCGCAT GAAAGTGCGT CACTTAGCGT TAAAGCCACT TTCCGCTATG TTACTATCTT 960

TAGGTGTAAC ATCTATTCCA CAATCTGTTT TAGCAAGCGG CAATTTAACA TCGACCAAAA 1020

TGAAATGGTG CAGTTTTTAC AAGAAAACAA GTAATAAAAC CATTATCCGC AACAGTGTTG 1080

ACGCTATCAT TAATTGGAAA CAATTTAACA TCGACCAAAA TGAAATGGTG CAGTTTTTAC 1140

AAGAAAACAA CAACTCCGCC GTATTCAACC GTGTTACATC TAACCAAATC TCCCAATTAA 1200

AAGGGATTTT AGATTCTAAC GGACAAGTCT TTTTAATCAA CCCAAATGGT ATCACAATAG 1260

GTAAAGACGC AATTATTAAC ACTAATGGCT TTACGGCTTC TACGCTAGAC ATTTCTAACG 1320

AAAACATCAA GGCGCGTAAT TTCACCTTCG AGCAAACCAA AGATAAAGCG CTCGCTGAAA 1380

TTGTGAATCA CGGTTTAATT ACTGTCGGTA AAGACGGCAG TGTAAATCTT ATTGGTGGCA 1440

AAGTGAAAAA CGAGGGTGTG ATTAGCGTAA ATGGTGGCAG CATTTCTTTA CTCGCAGGGC 1500

AAAAAATCAC CATCAGCGAT ATAATAAACC CAACCATTAC TTACAGCATT GCCGCGCCTG 1560

AAAATGAAGC GGTCAATCTG GGCGATATTT TTGCCAAAGG CGGTAACATT AATGTCCGTG 1620

CTGCCACTAT TCGAAACCAA GGTAAACTTT CTGCTGATTC TCTAAGCAAA GATAAAAGCG 1680

GCAATATTGT TCTTTCCGCC AAAGAGGGTG AAGCGGAAAT TGGCGGTGTA ATTTCCGCTC 1740

AAAATCAGCA AGCTAAAGGC GGCAAGCTGA TGATAAAGTC CGATAAAGTC ACATTAAAAA 1800

CAGGTGCAGT TATCGACCTT TCAGGTAAAG AAGGGGGAGA AACTTACCTT GGCGGTGACG 1860

AGCGCGGCGA AGGTAAAAAC GGCATTCAAT TAGCAAAGAA AACCTCTTTA GAAAAAGGCT 1920

CAACCATCAA TGTATCAGGC AAAGAAAAAG GCGGACGCGC TATTGTGTGG GGCGATATTG 1980

CGTTAATTGA CGGCAATATT AACGCTCAAG GTAGTGGTGA TATCGCTAAA ACCGGTCGTT 2040

TTGTGGAGAC ATCGGGGCAT TATTTATCCA TTGACAGCAA TCCAATTGTT AAAACAAAAG 2100

AGTGGTTGCT AGACCCTGAT GATGTAACAA TTGAAGCCGA AGACCCCCTT CGCAATAATA 2160

CCGGTATAAA TGATGAATTC GCAACAGGCA CCGGTGAAGC AAGCGACCCT AAAAAAAATA 2220

GCGAACTCAA AACAACGCTA ACCAATACAA CTATTTCA.^ TTATCTGAAA AACGCCTGGA 2280

CAATGAATAT AACGGCATCA AGAAAACTTA CCGTTAATAG CTCAATCAAC ATCGGAAGCA 2340

ACTCCCACTT AATTCTCCAT AGTAAAGGTC AGCGTGGCGG AGGCGTTCAG ATTGATGGAG 2400

ATATTACTTC TAAAGGCGGA AATTTAACCA TTTATTCTGG CGGATGGGTT GATGTTCATA 2460

AAAATATTAC GCTTGATCAG GGTTTTTTAA ATATTACCGC CGCTTCCGTA GCTTTTGAAG 2520

GTGGAAATAA CAAAGCACGC GACGCGGCAA ATGCTAAAAT TGTCGCCCAG GGCAGTGTAA 2580

CCATTACAGG AGAGGGAAAA GATTTCAGGG CTAACAACGT ATCTTTAAAC GGAACGGGTA 2640

AAGGTCTGAA TATCATTTCA TCAGTGAATA ATTTAACCCA CAATCTTAGT GGCACAATTA 2700

ACATATCTGG GAATATAACA ATTAACCAAA CTACGAGAAA GAACACCTCG TATTGGCAAA 2760

CCAGCCATGA TTCGCACTGG AACGTCAGTG CTCTTAATCT AGAGACAGGC GCAAATTTTA 2820

CCTTTATTAA ATACATTTCA AGCAATAGCA AAGGCTTAAC AACACAGTAT AGAAGCTCTG 2880

CAGGGGTGAA TTTTAACGGC GTAAATGGCA ACATGTCATT CAATCTCAAA GAAGGAGCGA 2940

AAGTTAATTT CAAATTAAAA CCAAACGAGA ACATGAACAC AAGCAAACCT TTACCAATTC 3000

GGTTTTTAGC CAATATCACA GCCACTGGTG GGGGCTCTGT TTTTTTTGAT ATATATGCCA 3060

ACCATTCTGG CAGAGGGGCT GAGTTAAAAA TGAGTGAAAT TAATATCTCT AACGGCGCTA 3120

ATTTTACCTT AAATTCCCAT GTTCGCGGCG ATGACGCTTT TAAAATCAAC AAAGACTTAA 3180

CCATAAATGC AACCAATTCA AATTTCAGCC TCAGACAGAC GAAAGATGAT TTTTATGACG 3240

GGTACGCACG CAATGCCATC AATTCAACCT ACAACATATC CATTCTGGGC GGTAATGTCA 3300

CCCTTGGTGG ACAAAACTCA AGCAGCAGCA TTACGGGGAA TATTACTATC GAGAAAGCAG 3360

CAAATGTTAC GCTAGAAGCC AATAACGCCC CTAATCAGCA AAACATAAGG GATAGAGTTA 3420

TAAAACTTGG CAGCTTGCTC GTTAATGGGA GTTTAAGTTT AACTGGCGAA AATGCAGATA 3480

TTAAAGGCAA TCTCACTATT TCAGAAAGCG CCACTTTTAA AGGAAAGACT AGAGATACCC 3540

TAAATATCAC CGGCAATTTT ACCAATAATG GCACTGCCGA AATTAATATA ACACAAGGAG 3600

TGGTAAAACT TGGCAATGTT ACCAATGATG GTGATTTAAA CATTACCACT CACGCTAAAC 3660

GCAACCAAAG AAGCATCATC GGCGGAGATA TAATCAACAA AAAAGGAAGC TTAAATATTA 3720

CAGACAGTAA TAATGATGCT GAAATCCAAA TTCGCGGCAA TATCTCGCAA AAAGAAGGCA 3780

ACCTCACGAT TTCTTCCGAT AAAATTAATA TCACCAAACA GATAACAATC AAAAAGGGTA 3840

TTGATGGAGA GGACTCTAGT TCAGATGCGA CAAGTAATCC CAACCTAACT ATTAAAACCA 3900

AAGAATTGAA ATTCACAGAA GACCTAAGTA TTTCAGGTTT CAATAAAGCA GAGATTACAG 3960

CCAAAGATGG TAGAGATTTA ACTATTGGCA ACAGTAATGA CGGTAACAGC GGTGCCGAAG 4020

CCAAAACAGT AACTTTTAAC AATGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAATG 4080

TGACACTAAA TAGCAAAGTG AAAACATCTA GCAGCAATGG CGGACGTCAA AGCAATAGCG 4140

ACAACGATAC CGGCTTAACT ATTACTGCAA AAAATGTAGA AGTAAACAAA GATATTACTT 4200

CTCTCAAAAC AGTAAATATC ACCGCGTCGG AAAAGGTTAC CACCACAGCA GGCTCGACCA 4260

TTAACGCAAC AAATGGCAAA GCAAGTATTA CAACCAAAAC AGGTGATATC AGCGGTACGA 4320

TTTCCGGTAA CACGGTAAGT GTTAGCGCGA CTGGTGATTT AACCACTAAA TCCGGCTCAA 4380

AAATTGAAGC GAAATCGGGT GAGGCTAATG TAACAAGTGC AACAGGTACA ATTGGCGGTA 4440

CAATTTCCGG TAATACGGTA AATGTTACGG CAAACGCTGG CGATTTAACA GTTGGGAATG 4500

GCGCAGAAAT TAATGCGACA GAAGGAGCTG CAACCTTAAC CGCAACAGGG AATACCTTGA 4560

CTACTGAAGC CGGTTCTAGC ATCACTTCAA CTAAGGGTCA GGTAGACCTC TTGGCTCAGA 4620

ATGGTAGCAT CGCAGGAAGC ATTAATGCTG CTAATGTGAC ATTAAATACT ACAGGCACCT 4680

TAACCACCGT GGCAGGCTCG GATATTAAAG CAACCAGCGG CACCTTGGTT ATTAACGCAA 4740

AAGATGCTAA GCTAAATGGT GATGCATCAG GTGATAGTAC AGAAGTGAAT GCAGTCAACG 4800

ACTGGGGATT TGGTAGTGTG ACTGCGGCAA CCTCAAGCAG TGTGAATATC ACTGGGGATT 4860

TAAACACAGT AAATGGGTTA AATATCATTT CGAAAGATGG TAGAAACACT GTGCGCTTAA 4920

GAGGCAAGGA AATTGAGGTG AAATATATCC AGCCAGGTGT AGCAAGTGTA GAAGAAGTAA 4980

TTGAAGCGAA ACGCGTCCTT GAAAAAGTAA AAGATTTATC TGATGAAGAA AGAGAAACAT 5040 

TAGCTAAACT TGGTGTAAGT GCTGTACGTT TTGTTGAGCC AAATAATACA ATTACAGTCA 5100 

ATACACAAAA TGAATTTACA ACCAGACCGT CAAGTCAAGT GATAATTTCT GAAGGTAAGG 5160 

CGTGTTTCTC AAGTGGTAAT GGCGCACGAG TATGTACCAA TGTTGCTGAC GATGGACAGC 5220 

CGTAGTCAGT AATTGACAAG GTAGATTTCA TCCTGCAATG AAGTCATTTT ATTTTCGTAT 5280 

TATTTACTGT GTGGGTTAAA GTTCAGTACG GGCTTTACCC ATCTTGTAAA AAATTACGGA 5340 

GAATACAATA AAGTATTTTT AACAGGTTAT TATTATGAAA AATATAAAAA GCAGATTAAA 5400 

ACTCAGTGCA ATATCAGTAT TGCTTGGCCT GGCTTCTTCA TCATTGTATG CAGAAGAAGC 5460 

GTTTTTAGTA AAAGGCTTTC AGTTATCTGG TGCACTTGAA ACTTTAAGTG AAGACGCCCA 5520

ACTGTCTGTA GCAAAATCTT TATCTAAATA CCAAGGCTCG CAAACTTTAA CAAACCTAAA 5580 

AACAGCACAG CTTGAATTAC AGGCTGTGCT AGATAAGATT GAGCCAAATA AATTTGATGT 5640 

GATATTGCCG CAACAAACCA TTACGGATGG CAATATCATG TTTGAGCTAG TCTCGAAATC 5700 

AGCCGCAGAA AGCCAAGTTT TTTATAAGGC GAGCCAGGGT TATAGTGAAG AAAATATCGC 5760 

TCGTAGCCTG CCATCTTTGA AACAAGGAAA AGTGTATGAA GATGGTCGTC AGTGGTTCGA 5820 

TTTGCGTGAA TTTAATATGG CAAAAGAAAA CCCGCTTAAG GTTACCCGTG TACATTACGA 5880 

ACTAAACCCT AAAAACAAAA CCTCTAATTT GATAATTGCG GGCTTCTCGC CTTTTGGTAA 5940

AACGCGTAGC TTTATTTCTT ATGATAATTT CGGCGCGAGA GAGTTTAACT ACCAACGTGT 6000 

AAGCTTGGGT TTTGTTAATG CCAATTTAAC TGGTCATGAT GATGTGTTAA TTATACCAGT 6060 

ATGAGTTATG CTGATTCTAA TGATATCGAC GGCTTACCAA GTGCGATTAA TCGTAAATTA 6120 

TCAAAAGGTC AATCTATCTC TGCGAATCTG AAATGGAGTT ATTATCTCCC AACATTTAAC 6180

CTTGGCATGG AAGACCAATT TAAAATTAAT TTAGGCTACA ACTACCGCCA TATTAATCAA 6240

ACCTCCGCGT TAAATCGCTT GGGTGAAACG AAGAAAAAAT TTGCAGTATC AGGCGTAAGT 6300

GCAGGCATTG ATGGACATAT CCAATTTACC CCTAAAACAA TCTTTAATAT TGATTTAACT 6360

CATCATTATT ACGCGAGTAA ATTACCAGGC TCTTTTGGAA TGGAGCGCAT TGGCGAAACA 6420

TTTAATCGCA GCTATCACAT TAGCACAGCC AGTTTAGGGT TGAGTCAAGA GTTTGCTCAA 6480

GGTTGGCATT TTAGCAGTCA ATTATCAGGT CAATTTACTC TACAAGATAT TAGCAGTATA 6540

GATTTATTCT CTGTAACAGG TACTTATGGC GTCAGAGGCT TTAAATACGG CGGTGCAAGT 6600

GGTGAGCGCG GTCTTGTATG GCGTAATGAA TTAAGTATGC CAAAATACAC CCGCTTCCAA 6660

ATCAGCCCTT ATGCGTTTTA TGATGCAGGT CAGTTCCGTT ATAATAGCGA AAATGCTAAA 6720

ACTTACGGCG AAGATATGCA CACGGTATCC TCTGCGGGTT TAGGCATTAA AACCTCTCCT 6780

ACACAAAACT TAAGCCTAGA TGCTTTTGTT GCTCGTCGCT TTGCAAATGC CAATAGTGAC 6840 

AATTTGAATG GCAACAAAAA ACGCACAAGC TCACCTACAA CCTTCTGGGG GAGATTAACA 6900 

TTCAGTTTCT AACCCTGAAA TTTAATCAAC TGGTAAGCGT TCCGCCTACC AGTTTATAAC 6960 

TATATGCTTT ACCCGCCAAT TTACAGTCTA TAGGCAACCC TGTTTTTACC CTTATATATC 7020 

AAATAAACAA GCTAAGCTGA GCTAAGCAAA CCAAGCAAAC TCAAGCAAGC CAAGTAATAC 7080 

TAAAAAAACA ATTTATATGA TAAACTAAAG TATACTCCAT GCCATGGCGA TACAAGGGAT 7140 

TTAATAATAT GACAAAAGAA AATTTGCAAA ACGCTCCTCA AGATGCGACC GCTTTACTTG 7200 

CGGAATTAAG CAACAATCAA ACTCCCCTGC GAATATTTAA ACAACCACGC AAGCCCAGCC 7260 

TATTACGCTT GGAACAACAT ATCGCAAAAA AAGATTATGA GTTTGCTTGT CGTGAATTAA 7320 

TGGTGATTCT GGAAAAAATG GACGCTAATT TTGGAGGCGT TCACGATATT GAATTTGACG 7380

CACCCGCTCA GCTGGCATAT CTACCCGAAA AATTACTAAT TTATTTTGCC ACTCGTCTCG 7440 

CTAATGCAAT TACAACACTC TTTTCCGACC CCGAATTGGC AATTTCTGAA GAAGGGGCGT 7500

TAAAGATGAT TAGCCTGCAA CGCTGGTTGA CGCTGATTTT TGCCTCTTCC CCCTACGTTA 7560 

ACGCAGACCA TATTCTCAAT AAATATAATA TCAACCCAGA TTCCGAAGGT GGCTTTCATT 7620 

TAGCAACAGA CAACTCTTCT ATTGCTAAAT TCTGTATTTT TTACTTACCC GAATCCAATG 7680 

TCAATATGAG TTTAGATGCG TTATGGGCAG GGAATCAACA ACTTTGTGCT TCATTGTGTT 7740 

TTGCGTTGCA GTCTTCACGT TTTATTGGTA CCGCATCTGC GTTTCATAAA AGAGCGGTGG 7800 

TTTTACAGTG GTTTCCTAAA AAACTCGCCG AAATTGCTAA TTTAGATGAA TTGCCTGCAA 7860 

ATATCCTTCA TGATGTATAT ATGCACTGCA GTTATGATTT AGCAAAAAAC AAGCACGATG 7920 

TTAAGCGTCC ATTAAACGAA CTTGTCCGCA AGCATATCCT CACGCAAGGA TGGCAAGACC 7980

GCTACCTTTA CACCTTAGGT AAAAAGGACG GCAAACCTGT GATGATGGTA CTGCTTGAAC 8040 

ATTTTAATTC GGGACATTCG ATTTATCGTA CACATTCAAC TTCAATGATT GCTGCTCGAG 8100 

AAAAATTCTA TTTAGTCGGC TTAGGCCATG AGGGCGTTGA TAAAATAGGT CGAGAAGTGT 8160 

TTGACGAGTT CTTTGAAATC AGTAGCAATA ATATAATGGA GAGACTGTTT TTTATCCGTA 8220 



AACAGTGCGA AACTTTCCAA CCCGCAGTGT TCTATATGCC AAGCATTGGC ATGGATATTA 8280

CCACGATTTT TGTGAGCAAC ACTCGGCTTG CCCCTATTCA AGCTGTAGCC CTGGGTCATC 8340

CTGCCACTAC GCATTCTGAA TTTATTGATT ATGTCATCGT AGAAGATGAT TATGTGGGCA 8400

GTGAAGATTG TTTCAGCGAA ACCCTTTTAC GCTTACCCAA AGATGCCCTA CCTTATGTAC 8460

CTTCTGCACT CGCCCCACAA AAAGTGGATT ATGTACTCAG GGAAAACCCT GAAGTAGTCA 8520

ATATCGGTAT TGCCGCTACC ACAATGAAAT TAAACCCTGA ATTTTTGCTA ACATTGCAAG 8580

AAATCAGAGA TAAAGCTAAA GTCAAAATAC ATTTTCATTT CGCACTTGGA CAATCAACAG 8640

GCTTGACACA CCCTTATGTC AAATGGTTTA TCGAAAGCTA TTTAGGTGAC GATGCCACTG 8700

CACATCCCCA CGCACCTTAT CACGATTATC TGGCAATATT GCGTGATTGC GATATGCTAC 8760

TAAATCCGTT TCCTTTCGGT AATACTAACG GCATAATTGA TATGGTTACA TTAGGTTTAG 8820

TTGGTGTATG CAAAACGGGG GATGAAGTAC ATGAACATAT TGATGAAGGT CTGTTTAAAC 8880

GCTTAGGACT ACCAGAATGG CTGATAGCCG ACACACGAGA AACATATATT GAATGTGCTT 8940

TGCGTCTAGC AGAAAACCAT CAAGAACGCC TTGAACTCCG TCGTTACATC ATAGAAAACA 9000

ACGGCTTACA AAAGCTTTTT ACAGGCGACC CTCGTCCATT GGGCAAAATA CTGCTTAAGA 9060

AAACAAATGA ATGGAAGCGG AAGCACTTGA GTAAAAAATA ACGGTTTTTT AAAGTAAAAG 9120

TGCGGTTAAT TTTCAAAGCG TTTTAAAAAC GTCTCAAAAA TCAACCGCAC TTTTATCTTT 9180

ATAACGATCC CGCACGCTGA CAGTTTATCA GCCTCCCGCC ATAAAACTCC GCCTTTCATG 9240

GCGGAGATTT TAGCCAAAAC TGGCAGAAAT TAAAGGCTAA AATCACCAAA TTGCACCACA 9300

AAATCACCAA TACCCACAAA AAA 9323



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATGAACAAGA TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT 60 

GAATTGACAC GGGGTTGTGA CCATTCCACA GAAAAAGGCA GTGAAAAACC TGTTCGTACG 120 

AAAGTACGCC ACTTGGCGTT AAAGCCACTT TCCGCTATAT TGCTATCTTT GGGCATGGCA 180 

TCCATTCCGC AATCTGTTTT AGCGAGCGGT TTACAGGGAA TGAGCGTCGT ACACGGTACA 240 

GCAACCATGC AAGTAGACGG CAATAAAACC ACTATCCGTA ATAGCGTCAA TGCTATCATC 300 

AATTGGAAAC AATTTAACAT TGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAGCAGC 360 

AACTCTGCCG TTTTCAACCG TGTTACATCT GACCAAATCT CCCAATTAAA AGGGATTTTA 420 



GATTCTAACG GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA 480

ATTATTAACA CTAATGGCTT TACTGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG 540

GCGCGTAATT TCACCCTTGA GCAAACCAAG GATAAAGCAC TCGCTGAAAT CGTGAATCAC 600

GGTTTAATTA CCGTTGGTAA AGACGGTAGC GTAAACCTTA TTGGTGGCAA AGTGAAAAAC 660

GAGGGCGTGA TTAGCGTAAA TGGCGGTAGT ATTTCTTTAC TTGCAGGGCA AAAAATCACC 720

ATCAGCGATA TAATAAATCC AACCATCACT TACAGCATTG CTGCACCTGA AAACGAAGCG 780

ATCAATCTGG GCGATATTTT TGCCAAAGGT GGTAACATTA ATGTCCGCGC TGCCACTATT 840

CGCAATAAAG GTAAACTTTC TGCCGACTCT GTAAGCAAAG ATAAAAGTGG TAACATTGTT 900

CTCTCTGCCA AAGAAGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 960

GCCAAAGGTG GTAAGTTGAT GATTACAGGC GATAAAGTTA CATTGAAAAC GGGTGCAGTT 1020

ATCGACCTTT CGGGTAAAGA AGGGGGAGAA ACTTATCTTG GCGGTGACGA GCGTGGCGAA 1080

GGTAAAAACG GCATTCAATT AGCAAAGAAA ACCACTTTAG AAAAAGGCTC AACAATTAAT 1140 

GTGTCAGGTA AAGAAAAAGG TGGGCGCGCT ATTGTATGGG GCGATATTGC GTTAATTGAC 1200 

GGCAATATTA ATGCCCAAGG TAAAGATATC GCTAAAACTG GTGGTTTTGT GGAGACGTCG 1260 

GGGCATTACT TATCCATTGA TGATAACGCA ATTGTTAAAA CAAAAGAATG GCTACTAGAC 1320

CCAGAGAATG TGACTATTGA AGCTCCTTCC GCTTCTCGCG TCGAGCTGGG TGCCGATAGG 1380

AATTCCCACT CGGCAGAGGT GATAAAAGTG ACCCTAAAAA AAAATAACAC CTCCTTGACA 1440 

ACACTAACCA ATACAACCAT TTCAAATCTT CTGAAAAGTG CCCACGTGGT GAACATAACG 1500 

GCAAGGAGAA AACTTACCGT TAATAGCTCT ATCAGTATAG AAAGAGGCTC CCACTTAATT 1560 

CTCCACAGTG AAGGTCAGGG CGGTCAAGGT GTTCAGATTG ATAAAGATAT TACTTCTGAA 1620 

GGCGGAAATT TAACCATTTA TTCTGGCGGA TGGGTTGATG TTCATAAAAA TATTACGCTT 1680

GGTAGCGGCT TTTTAAACAT CACAACTAAA GAAGGAGATA TCGCCTTCGA AGACAAGTCT 1740

GGACGGAACA ACCTAACCAT TACAGCCCAA GGGACCATCA CCTCAGGTAA TAGTAACGGC 1800 

TTTAGATTTA ACAACGTCTC TCTAAACAGC CTTGGCGGAA AGCTGAGCTT TACTGACAGC 1860 

AGAGAGGACA GAGGTAGAAG AACTAAGGGT AATATCTCAA ACAAATTTGA CGGAACGTTA 1920 

AACATTTCCG GAACTGTAGA TATCTCAATG AAAGCACCCA AAGTCAGCTG GTTTTACAGA 1980 

GACAAAGGAC GCACCTACTG GAACGTAACC ACTTTAAATG TTACCTCGGG TAGTAAATTT 2040 

AACCTCTCCA TTGACAGCAC AGGAAGTGGC TCAACAGGTC CAAGCATACG CAATGCAGAA 2100 

TTAAATGGCA TAACATTTAA TAAAGCCACT TTTAATATCG CACAAGGCTC AACAGCTAAC 2160 

TTTAGCATCA AGGCATCAAT AATGCCCTTT AAGAGTAACG CTAACTACGC ATTATTTAAT 2220 

GAAGATATTT CAGTCTCAGG GGGGGGTAGC CTTAATTTCA AACTTAACGC CTCATCTAGC 2280 

AACATACAAA CCCCTGGCGT AATTATAAAA TCTCAAAACT TTAATGTCTC AGGAGGGTCA 2340 

ACTTTAAATG TCAAGGCTGA AGGTTCAACA GAAACCGCTT TTTCAATAGA AAATGATTTA 2400 

AACTTAAACG CCACCGGTGG CAATATAACA ATCAGACAAG TCGAGGGTAC CGATTCACGC 2460

GTCAACAAAG GTGTCGCAGC CAAAAAAAAC ATAACTTTTA AAGGGGGTAA TATCACCTTC 2520

GGCTCTCAAA AAGCCACAAC AGAAATCAAA GGCAATGTTA CCATCAATAA AAACACTAAC 2580

GCTACTCTTT GTGGTGCGAA TTTTGCCGAA AA.CA.:^TCGC CTTTAAATAT AGCAGGAAAT 2640

GTTATTAATA ATGGCAACCT TACCACTGCC GGCTCCATTA TCAATATAGC CGGAAATCTT 2700

ACTGTTTCAA AAGGCGCTAA CCTTCAAGCT ATAACAAATT ACACTTTTAA TGTAGCCGGC 2760

TCATTTGACA ACAATGGCGC TTCAAACATT TCCATTGCCA GAGGAGGGGC TAAATTTAAA 2820 

GATATCAATA ACACCAGTAG CTTAAATATT ACCACCAACT CTGATACCAC TTACCGCACC 2880 

ATTATAAAAG GCAATATATC CAACAAATCA GGTGATTTGA ATATTATTGA TAAAAAAAGC 2940 

GACGCTGAAA TCCAAATTGG CGGCAATATC TCACAAAAAG AAGGCAATCT CACAATTTCT 3000 

TCTGATAAAG TAAATATTAC CAATCAGATA ACAATCAAAG CAGGCGTTGA AGGGGGGCGT 3060

TCTGATTCAA GTGAGGCAGA AAATGCTAAC CTAACTATTC AAACCAAAGA GTTAAAATTG 3120 

GCAGGAGACC TAAATATTTC AGGCTTTAAT AAAGCAGAAA TTACAGCTAA AAATGGCAGT 3180 

GATTTAACTA TTGGCAATGC TAGCGGTGGT AATGCTGATG CTAAAAAAGT GACTTTTGAC 3240 

AAGGTTAAAG ATTCAAAAAT CTCGACTGAC GGTCACAATG TAACACTAAA TAGCGAAGTG 3300 

AAAACGTCTA ATGGTAGTAG CAATGCTGGT AATGATAACA GCACCGGTTT AACCATTTCC 3360

GCAAAAGATG TAACGGTAAA CAATAACGTT ACCTCCCACA AGACAATAAA TATCTCTGCC 3420

GCAGCAGGAA ATGTAACAAC CAAAGAAGGC ACAACTATCA ATGCAACCAC AGGCAGCGTG 3480

GAAGTAACTG CTCAAAATGG TACAATTAAA GGCAACATTA CCTCGCAAAA TGTAACAGTG 3540 

ACAGCAACAG AAAATCTTGT TACCACAGAG AATGCTGTCA TTAATGCAAC CAGCGGCACA 3600 

GTAAACATTA GTACAAAAAC AGGGGATATT AAAGGTGGAA TTGAATCAAC TTCCGGTAAT 3660 

GTAAATATTA CAGCGAGCGG CAATACACTT AAGGTAAGTA ATATCACTGG TCAAGATGTA 3720 

ACAGTAACAG CGGATGCAGG AGCCTTGACA ACTACAGCAG GCTCAACCAT TAGTGCGACA 3780

ACAGGCAATG CAAATATTAC AACCAAAACA GGTGATATCA ACGGTAAAGT TGAATCCAGC 3840 

TCCGGCTCTG TAACACTTGT TGCAACTGGA GCAACTCTTG CTGTAGGTAA TATTTCAGGT 3900 

AACACTGTTA CTATTACTGC GGATAGCGGT AAATTAACCT CCACAGTAGG TTCTACAATT 3960

AATGGGACTA ATAGTGTAAC CACCTCAAGC CAATCAGGCG ATATTGAAGG TACAATTTCT 4020 

GGTAATACAG TAAATGTTAC AGCAAGCACT GGTGATTTAA CTATTGGAAA TAGTGCAAAA 4080

GTTGAAGCGA AAAATGGAGC TGCAACCTTA ACTGCTGAAT CAGGCAAATT AACCACCCAA 4140 

ACAGGCTCTA GCATTACCTC AAGCAATGGT CAGACAACTC TTACAGCCAA GGATAGCAGT 4200 

ATCGCAGGAA ACATTAATGC TGCTAATGTG ACGTTAAATA CCACAGGCAC TTTAACTACT 4260 

ACAGGGGATT CAAAGATTAA CGCAACCAGT GGTACCTTAA CAATCAATGC AAAAGATGCC 4320 

AAATTAGATG GTGCTGCATC AGGTGACCGC ACAGTAGTAA ATGCAACTAA CGCAAGTGGC 4380 

TCTGGTAACG TGACTGCGAA AACCTCAAGC AGCGTGAATA TCACCGGGGA TTTAAACACA 4440

ATAAATGGGT TAAATATCAT TTCGGAAAAT GGTAGAAACA CTGTGCGCTT AAGAGGCAAG 4500 



GAAATTGATG TGAAATATAT CCAACCAGGT GTAGCAAGCG TAGAAGAGGT AATTGAAGCG 4560

AAACGCGTCC TTGAGAAGGT AAAAGATTTA TCTGATGAAG AAAGAGAAAC ACTAGCCAAA 4620

CTTGGTGTAA GTGCTGTACG TTTCGTTGAG CCAAATAATG CCATTACGGT TAATACACAA 4680

AACGAGTTTA CAACCAAACC ATCAAGTCAA GTGACAATTT CTGAAGGTAA GGCGTGTTTC 4740

TCAAGTGGTA ATGGCGCACG AGTATGTACC AATGTTGCTG ACGATGGACA GCAG 4794

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4803 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:







CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT 60

CCATTCCACA GAAAAAGGCA GTGAAAAACC TGTTCGTACG 120

AAAGTACGCC ACTTGGCGTT AAAGCCACTT TCCGCTATAT TGCTATCTTT GGGCATGGCA 180

TCCATTCCGC AATCTGTTTT AGCGAGCGGT TTACAGGGAA TGAGCGTCGT ACACGGTACA 240

GCAACCATGC AAGTAGACGG CAATAAAACC ACTATCCGTA ATAGCGTCAA TGCTATCATC 300

AATTGGAAAC AATTTAACAT TGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAGCAGC 360

AACTCTGCCG TTTTCAACCG TGTTACATCT GACCAAATCT CCCAATTAAA AGGGATTTTA 420

GATTCTAACG GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA 480

ATTATTAACA CTAATGGCTT TACTGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG 540

GCGCGTAATT TCACCCTTGA GCAAACCAAG GATAAAGCAC TCGCTGAAAT CGTGAATCAC 600

GGTTTAATTA CCGTTGGTAA AGACGGTAGC GTAAACCTTA TTGGTGGCAA AGTGAAAAAC 660

GAGGGCGTGA TTAGCGTAAA TGGCGGTAGT ATTTCTTTAC TTGCAGGGCA AAAAATCACC 720

ATCAGCGATA TAATAAATCC AACCATCACT TACAGCATTG CTGCACCTGA AAACGAAGCG 780

ATCAATCTGG GCGATATTTT TGCCAAAGGT GGTAACATTA ATGTCCGCGC TGCCACTATT 840

CGCAATAAAG GTAAACTTTC TGCCGACTCT GTAAGCAAAG ATAAAAGTGG TAACATTGTT 900

CTCTCTGCCA AAGAAGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 960

GCCAAAGGTG GTAAGTTGAT GATTACAGGT GATAAAGTCA CATTAAAAAC AGGTGCAGTT 1020

ATCGACCTTT CAGGTAAAGA AGGGGGAGAG ACTTATCTTG GCGGTGATGA GCGTGGCGAA 1080

GGTAAAAATG GTATTCAATT AGCGAAGAAA ACCTCTTTAG AAAAAGGCTC GACAATTAAT 1140

GTATCAGGCA AAGAAAAAGG CGGGCGCGCT ATTGTATGGG GCGATATTGC ATTAATTAAT 1200

GGTAACATTA ATGCTCAAGG TAGCGATATT GCTAAAACTG GCGGCTTTGT GGAAACATCA 1260

GGACATGACT TATCCATTGG TGATGATGTG ATTGTTGACG CTAAAGAGTG GTTATTAGAC 1320

CCAGATGATG TGTCCATTGA AACTCTTACA TCTGGACGCA ATAATACCGG CGAAAACCAA 1380

GGATATACAA CAGGAGATGG GACTAAAGAG TCACCTAAAG GTAATAGTAT TTCTAAACCT 1440

ACATTAACAA ACTCAACTCT TGAGCAAATC CTAAGAAGAG GTTCTTATGT TAATATCACT 1500

GCTAATAATA GAATTTATGT TAATAGCTCC ATCAACTTAT CTAATGGCAG TTTAACACTT 1560

CACACTAAAC GAGATGGAGT TAAAATTAAC GGTGATATTA CCTCAAACGA AAATGGTAAT 1620

TTAACCATTA AAGCAGGCTC TTGGGTTGAT GTTCATAAAA ACATCACGCT TGGTACGGGT 1680

TTTTTGAATA TTGTCGCTGG GGATTCTGTA GCTTTTGAGA GAGAGGGCGA TAAAGCACGT 1740

AACGCAACAG ATGCTCAAAT TACCGCACAA GGGACGATAA CCGTCAATAA AGATGATAAA 1800

CAATTTAGAT TCAATAATGT ATCTATTAAC GGGACGGGCA AGGGTTTAAA GTTTATTGCA 1860

AATCAAAATA ATTTCACTCA TAAATTTGAT GGCGAAATTA ACATATCTGG AATAGTAACA 1920

ATTAACCAAA CCACGAAAAA AGATGTTAAA TACTGGAATG CATCAAAAGA CTCTTACTGG 1980

AATGTTTCTT CTCTTACTTT GAATACGGTG CAAAAATTTA CCTTTATAAA ATTCGTTGAT 2040

AGCGGCTCAA ATTCCCAAGA TTTGAGGTCA TCACGTAGAA GTTTTGCAGG CGTACATTTT 2100

AACGGCATCG GAGGCAAAAC AAACTTCAAC ATCGGAGCTA ACGCAAAAGC CTTATTTAAA 2160

TTAAAACCAA ACGCCGCTAC AGACCCAAAA AAAGAATTAC CTATTACTTT TAACGCCAAC 2220

ATTACAGCTA CCGGTAACAG TGATAGCTCT GTGATGTTTG ACATACACGC CAATCTTACC 2280

TCTAGAGCTG CCGGCATAAA CATGGATTCA ATTAACATTA CCGGCGGGCT TGACTTTTCC 2340

ATAACATCCC ATAATCGCAA TAGTAATGCT TTTGAAATCA AAAAAGACTT AACTATAAAT 2400

GCAACTGGCT CGAATTTTAG TCTTAAGCAA ACGAAAGATT CTTTTTATAA TGAATACAGC 2460

AAACACGCCA TTAACTCAAG TCATAATCTA ACCATTCTTG GCGGCAATGT CACTCTAGGT 2520

GGGGAAAATT CAAGCAGTAG CATTACGGGC AATATCAATA TCACCAATAA AGCAAATGTT 2580

ACATTACAAG CTGACACCAG CAACAGCAAC ACAGGCTTGA AGAAAAGAAC TCTAACTCTT 2640

GGCAATATAT CTGTTGAGGG GAATTTAAGC CTAACTGGTG CAAATGCAAA CATTGTCGGC 2700

AATCTTTCTA TTGCAGAAGA TTCCACATTT AAAGGAGAAG CCAGTGACAA CCTAAACATC 2760

ACCGGCACCT TTACCAACAA CGGTACCGCC AACATTAATA TAAAACAAGG AGTGGTAAAA 2820

CTCCAAGGCG ATATTATCAA TAAAGGTGGT TTAAATATCA CTACTAACGC CTCAGGCACT 2880

CAAAAAACCA TTATTAACGG AAATATAACT AACGAAAAAG GCGACTTAAA CATCAAGAAT 2940

ATTAAAGCCG ACGCCGAAAT CCAAATTGGC GGCAATATCT CACAAAAAGA AGGCAATCTC 3000

ACAATTTCTT CTGATAAAGT AAATATTACC AATCAGATAA CAATCAAAGC AGGCGTTGAA 3060

GGGGGGCGTT CTGATTCAAG TGAGGCAGAA AATGCTAACC TAACTATTCA AACCAAAGAG 3120

TTAAAATTGG CAGGAGACCT AAATATTTCA GGCTTTAATA AAGCAGAAAT TACAGCTAAA 3180

AATGGCAGTG ATTTAACTAT TGGCAATGCT AGCGGTGGTA ATGCTGATGC TAAAAAAGTG 3240

ACTTTTGACA AGGTTAAAGA TTCAAAAATC TCGACTGACG GTCACAATGT AACACTAAAT 3300

AGCGAAGTGA AAACGTCTAA TGGTAGTAGC AATGCTGGTA ATGATAACAG CACCGGTTTA 3360

ACCATTTCCG CAAAAGATGT AACGGTAAAC AATA.'.....-GTTA CCTCCCACAA GACAATAAAT 3420

ATCTCTGCCG CAGCAGGAAA TGTAACAACC AAAGAAGGi:A CAACTATCAA TGCAACCACA 3480

GGCAGCGTGG AAGTAACTGC TCAAAATGGT ACAATTAAAG GCAACATTAC CTCGCAAAAT 3540

GTAACAGTGA CAGCAACAGA AAATCTTGTT ACCACAGAGA ATGCTGTCAT TAATGCAACC 3600

AGCGGCACAG TAAACATTAG TACAAAAACA GGGGATATTA AAGGTGGAAT TGAATCAACT 3660

TCCGGTAATG TAAATATTAC AGCGAGCGGC AATACACTTA AGGTAAGTAA TATCACTGGT 3720 

CAAGATGTAA CAGTAACAGC GGATGCAGGA GCCTTGACAA CTACAGCAGG CTCAACCATT 3780

AGTGCGACAA CAGGCAATGC AAATATTACA ACCAAAACAG GTGATATCAA CGGTAAAGTT 3840

GAATCCAGCT CCGGCTCTGT AACACTTGTT GCAACTGGAG CAACTCTTGC TGTAGGTAAT 3900

ATTTCAGGTA ACACTGTTAC TATTACTGCG GATAGCGGTA AATTAACCTC CACAGTAGGT 3960

TCTACAATTA ATGGGACTAA TAGTGTAACC ACCTCAAGCC AATCAGGCGA TATTGAAGGT 4020 

ACAATTTCTG GTAATACAGT AAATGTTACA GCAAGCACTG GTGATTTAAC TATTGGAAAT 4080 

AGTGCAAAAG TTGAAGCGAA AAATGGAGCT GCAACCTTAA CTGCTGAATC AGGCAAATTA 4140 

ACCACCCAAA CAGGCTCTAG CATTACCTCA AGCAATGGTC AGACAACTCT TACAGCCAAG 4200

GATAGCAGTA TCGCAGGAAA CATTAATGCT GCTAATGTGA CGTTAAATAC CACAGGCACT 4260

TTAACTACTA CAGGGGATTC AAAGATTAAC GCAACCAGTG GTACCTTAAC AATCAATGCA 4320

AAAGATGCCA AATTAGATGG TGCTGCATCA GGTGACCGCA CAGTAGTAAA TGCAACTAAC 4380

GCAAGTGGCT CTGGTAACGT GACTGCGAAA ACCTCAAGCA GCGTGAATAT CACCGGGGAT 4440 

TTAAACACAA TAAATGGGTT AAATATCATT TCGGAAAATG GTAGAAACAC TGTGCGCTTA 4500

AGAGGCAAGG AAATTGATGT GAAATATATC CAACCAGGTG TAGCAAGCGT AGAAGAGGTA 4560

ATTGAAGCGA AACGCGTCCT TGAGAAGGTA AAAGATTTAT CTGATGAAGA AAGAGAAACA 4620 

CTAGCCAAAC TTGGTGTAAG TGCTGTACGT TTCGTTGAGC CAAATAATGC CATTACGGTT 4680

AATACACAAA ACGAGTTTAC AACCAAACCA TCAAGTCAAG TGACAATTTC TGAAGGTAAG 4740

GCGTGTTTCT CAAGTGGTAA TGGCGCACGA GTATGTACCA ATGTTGCTGA CGATGGACAG 4800 

CAG 4803 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1599 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Thr Ile Arg Asn Ser Val
85 90 95

Asn Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Glu Gln Phe Leu Gln Glu Ser Ser Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asp Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Leu Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Ile Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Lys Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Thr Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Lys Asp Ile Ala Lys Thr Gly Gly Phe
405 410 415

Val Glu Thr Ser Gly His Tyr Leu Ser Ile Asp Asp Asn Ala Ile Val
420 425 430

Lys Thr Lys Glu Trp Leu Leu Asp Pro Glu Asn Val Thr Ile Glu Ala
435 440 445

Pro Ser Ala Ser Arg Val Glu Leu Gly Ala Asp Arg Asn Ser His Ser
450 455 460

Ala Glu Val Ile Lys Val Thr Leu Lys Lys Asn Asn Thr Ser Leu Thr
465 470 475 480

Thr Leu Thr Asn Thr Thr Ile Ser Asn Leu Leu Lys Ser Ala His Val
485 490 495

Val Asn Ile Thr Ala Arg Arg Lys Leu Thr Val Asn Ser Ser Ile Ser
500 505 510

Ile Glu Arg Gly Ser His Leu Ile Leu His Ser Glu Gly Gln Gly Gly
515 520 525

Gln Gly Val Gln Ile Asp Lys Asp Ile Thr Ser Glu Gly Gly Asn Leu
530 535 540

Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr Leu
545 550 555 560

Gly Ser Gly Phe Leu Asn Ile Thr Thr Lys Glu Gly Asp Ile Ala Phe
565 570 575

Glu Asp Lys Ser Gly Arg Asn Asn Leu Thr Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Ser Asn Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Ser Leu Gly Gly Lys Leu Ser Phe Thr Asp Ser Arg Glu Asp Arg
610 615 620

Gly Arg Arg Thr Lys Gly Asn Ile Ser Asn Lys Phe Asp Gly Thr Leu
625 630 635 640

Asn Ile Ser Gly Thr Val Asp Ile Ser Met Lys Ala Pro Lys Val Ser
645 650 655

Trp Phe Tyr Arg Asp Lys Gly Arg Thr Tyr Trp Asn Val Thr Thr Leu
660 665 670

Asn Val Thr Ser Gly Ser Lys Phe Asn Leu Ser Ile Asp Ser Thr Gly
675 680 685



Ser Gly Ser Thr Gly Pro Ser Ile Arg Asn Ala Glu Leu Asn Gly Ile
690 695 700

Thr Phe Asn Lys Ala Thr Phe Asn Ile Ala Gln Gly Ser Thr Ala Asn
705 710 715 720

Phe Ser Ile Lys Ala Ser Ile Met Pro Phe Lys Ser Asn Ala Asn Tyr
725 730 735

Ala Leu Phe Asn Glu Asp Ile Ser Val Ser Gly Gly Gly Ser Val Asn
740 745 750

Phe Lys Leu Asn Ala Ser Ser Ser Asn Ile Gln Thr Pro Gly Val Ile
755 760 765

Ile Lys Ser Gln Asn Phe Asn Val Ser Gly Gly Ser Thr Leu Asn Leu
770 775 780

Lys Ala Glu Gly Ser Thr Glu Thr Ala Phe Ser Ile Glu Asn Asp Leu
785 790 795 800

Asn Leu Asn Ala Thr Gly Gly Asn Ile Thr Ile Arg Gln Val Glu Gly
805 810 815

Thr Asp Ser Arg Val Asn Lys Gly Val Ala Ala Lys Lys Asn Ile Thr
820 825 830

Phe Lys Gly Gly Asn Ile Thr Phe Gly Ser Gln Lys Ala Thr Thr Glu
835 840 845

Ile Lys Gly Asn Val Thr Ile Asn Lys Asn Thr Asn Ala Thr Leu Arg
850 855 860

Gly Ala Asn Phe Ala Glu Asn Lys Ser Pro Leu Asn Ile Ala Gly Asn
865 870 875 880

Val Ile Asn Asn Gly Asn Leu Thr Thr Ala Gly Ser Ile Ile Asn Ile
885 890 895

Ala Gly Asn Leu Thr Val Ser Lys Gly Ala Asn Leu Gln Ala Ile Thr
900 905 910

Asn Tyr Thr Phe Asn Val Ala Gly Ser Phe Asp Asn Asn Gly Ala Ser
915 920 925

Asn Ile Ser Ile Ala Arg Gly Gly Ala Lys Phe Lys Asp Ile Asn Asn
930 935 940

Thr Ser Ser Leu Asn Ile Thr Thr Asn Ser Asp Thr Thr Tyr Arg Thr
945 950 955 960

Ile Ile Lys Gly Asn Ile Ser Asn Lys Ser Gly Asp Leu Asn Ile Ile
965 970 975

Asp Lys Lys Ser Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser Gln
980 985 990

Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr Asn
995 1000 1005

Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser Ser
1010 1015 1020

Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys Leu
1025 1030 1035 1040

Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala
1045 1050 1055

Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn Ala
1060 1065 1070

Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

Thr Asp Gly His Asn Val Thr Leu Asn Ser Glu Val Lys Thr Ser Asn
1090 1095 1100

Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile Ser
1105 1110 1115 1120

Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr Ile
1125 1130 1135

Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr Thr
1140 1145 1150

Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly Thr
1155 1160 1165

Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr Glu
1170 1175 1180

Asn Leu Val Thr Thr Glu Asn Ala Val Ile Asn Ala Thr Ser Gly Thr
1185 1190 1195 1200

Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu Ser
1205 1210 1215

Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys Val
1220 1225 1230

Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly Ala
1235 1240 1245

Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn Ala
1250 1255 1260

Asn Ile Thr Thr Lys Thr Gly Asp Ile Asn Gly Lys Val Glu Ser Ser
1265 1270 1275 1280

Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val Gly
1285 1290 1295

Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys Leu
1300 1305 1310

Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr Thr
1315 1320 1325

Ser Ser Gln Ser Gly Asp Ile Glu Gly Thr Ile Ser Gly Asn Thr Val
1330 1335 1340

Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala Lys
1345 1350 1355 1360

Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly Lys
1365 1370 1375

Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln Thr
1380 1385 1390

Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala Ala
1395 1400 1405

Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp Ser
1410 1415 1420

Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp Ala
1425 1430 1435 1440

Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala Thr
1445 1450 1455

Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser Val
1460 1465 1470

Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile Ser
1475 1480 1485

Glu Asn Gly Arg Asn Thr Val Arg Leu Arg Gly Lys Glu Ile Asp Val
1490 1495 1500

Lys Tyr Ile Gln Pro Gly Val Ala Ser Val Glu Glu Val Ile Glu Ala
1505 1510 1515 1520

Lys Arg Val Leu Glu Lys Val Lys Asp Leu Ser Asp Glu Glu Arg Glu
1525 1530 1535

Thr Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Val Glu Pro Asn
1540 1545 1550

Asn Ala Ile Thr Val Asn Thr Gln Asn Glu Phe Thr Thr Lys Pro Ser
1555 1560 1565

Ser Gln Val Thr Ile Ser Glu Gly Lys Ala Cys Phe Ser Ser Gly Asn
1570 1575 1580

Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro
1585 1590 1595

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1600 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80



T., „e, „.i 

zae rxe Asn T.p ^^^^ 
01. =1„ P.. oi„ 01„ se. s.. 1° 
Th. sej .sp Oln XU se. oln 

=ln val u. p„ ^^^^ 

ue «3„ .h. .3„ Cl, P.. 2 
=lu .s„ Tie .,3 

Ala o:u v.i „j3 01, .eu Th. Val ,,3 .3p 

Oly ser val A3n He Oly oly .y. val Lys .s„ oly v,l xle 

S.. V31 .3„ =Xy Oly Se. XXe s„ cxy .y^ xle 

XI. se, .3p Xle XU ..n P„ ..r xle TK. xyr Se. xXe .le .1. p" 

255 

Glu Asn Glu Ala He Asn Leu Glv Asd T\a vh^ T.^ r 

260 ^ ^-^^ Ala Lys Gly Gly Asn 

270 

Xle .3„ val ..3 Ala «a Th. Xle ^„ .y^ 

ASP ser Val Ser , oly ^„ xle Val se, Ly= 

300 

01. Oly Cl„ Ma Olu Xle oly oly Val Xle Se. «a ol„ 01. ai„ 

320 

Ala .y. Oly oxy .y3 .eu Met xle Th. oly ..p .y3 val Th. .y^ 

Thr Oly «a Val xle Asp .eu Ser cly .ys olu oly oly olu Z ryr 

■^^'^ 350 
Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala

Lys Lys Thr Thr Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys

Glu Lys Oly oly Ma lie Val oly Asp xle Ma Leu Xle Asp 

400 

Gly Asn Ile Asn Ala Gln Gly Ser Asp Ile Ala Lys Thr Gly Gly Phe

Val Glu Thr Ser Gly His Asp Leu Ser Ile Gly Asp Asp Val Ile Val
420 425 430



Asp Ala Lys Glu Trp Leu Leu Asp Pro Asp Asp Val Ser Ile Glu Thr
435 440 445

Leu Thr Ser Gly Arg Asn Asn Thr Gly Glu Asn Gln Gly Tyr Thr Thr
450 455 460

Gly Asp Gly Thr Lys Glu Ser Pro Lys Gly Asn Ser Ile Ser Lys Pro
465 470 475 480

Thr Leu Thr Asn Ser Thr Leu Glu Gln Ile Leu Arg Arg Gly Ser Tyr
485 490 495

Val Asn Ile Thr Ala Asn Asn Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu His Thr Lys Arg Asp Gly Val Lys
515 520 525

Ile Asn Gly Asp Ile Thr Ser Asn Glu Asn Gly Asn Leu Thr Ile Lys
530 535 540

Ala Gly Ser Trp Val Asp Val His Lys Asn Ile Thr Leu Gly Thr Gly
545 550 555 560

Phe Leu Asn Ile Val Ala Gly Asp Ser Val Ala Phe Glu Arg Glu Gly
565 570 575

Asp Lys Ala Arg Asn Ala Thr Asp Ala Gln Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Val Asn Lys Asp Asp Lys Gln Phe Arg Phe Asn Asn Val Ser
595 600 605

Leu Asn Gly Thr Gly Lys Gly Leu Lys Phe Ile Ala Asn Gln Asn Asn
610 615 620

Phe Thr His Lys Phe Asp Gly Glu Ile Asn Ile Ser Gly Ile Val Thr
625 630 635 640

Ile Asn Gln Thr Thr Lys Lys Asp Val Lys Tyr Trp Asn Ala Ser Lys
645 650 655

Asp Ser Tyr Trp Asn Val Ser Ser Leu Thr Leu Asn Thr Val Gln Lys
660 665 670

Phe Thr Phe Ile Lys Phe Val Asp Ser Gly Ser Asn Gly Gln Asp Leu
675 680 685

Arg Ser Ser Arg Arg Ser Phe Ala Gly Val His Phe Asn Gly Ile Gly
690 695 700

Gly Lys Thr Asn Phe Asn Ile Gly Ala Asn Ala Lys Ala Leu Phe Lys
705 710 715 720

Leu Lys Pro Asn Ala Ala Thr Asp Pro Lys Lys Glu Leu Pro Ile Thr
725 730 735

Phe Asn Ala Asn Ile Thr Ala Thr Gly Asn Ser Asp Ser Ser Val Met
740 745 750

Phe Asp Ile His Ala Asn Leu Thr Ser Arg Ala Ala Gly Ile Asn Met
755 760 765

Asp Ser Ile Asn Ile Thr Gly Gly Leu Asp Phe Ser Ile Thr Ser His
770 775 780



Asn Arg Asn Ser Asn Ala Phe Glu Ile Lys Lys Asp Leu Thr Ile Asn
785 790 795 800

Ala Thr Gly Ser Asn Phe Ser Leu Lys Gln Thr Lys Asp Ser Phe Tyr
805 810 815

Asn Glu Tyr Ser Lys His Ala Ile Asn Ser Ser His Asn Leu Thr Ile
820 825 830

Leu Gly Gly Asn Val Thr Leu Gly Gly Glu Asn Ser Ser Ser Ser Ile
835 840 845

Thr Gly Asn Ile Asn Ile Thr Asn Lys Ala Asn Val Thr Leu Gln Ala
850 855 860

Asp Thr Ser Asn Ser Asn Thr Gly Leu Lys Lys Arg Thr Leu Thr Leu
865 870 875 880

Gly Asn Ile Ser Val Glu Gly Asn Leu Ser Leu Thr Gly Ala Asn Ala
885 890 895

Asn Ile Val Gly Asn Leu Ser Ile Ala Glu Asp Ser Thr Phe Lys Gly
900 905 910

Glu Ala Ser Asp Asn Leu Asn Ile Thr Gly Thr Phe Thr Asn Asn Gly
915 920 925

Thr Ala Asn Ile Asn Ile Lys Gly Val Val Lys Leu Gly Asp Ile Asn
930 935 940

Asn Lys Gly Gly Leu Asn Ile Thr Thr Asn Ala Ser Gly Thr Gln Lys
945 950 955 960

Thr Ile Ile Asn Gly Asn Ile Thr Asn Glu Lys Gly Asp Leu Asn Ile
965 970 975

Lys Asn Ile Lys Ala Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr
995 1000 1005

Asn Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser
1010 1015 1020

Ser Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn
1060 1065 1070

Ala Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile
1075 1080 1085

Ser Thr Asp Gly His Asn Val Thr Leu Asn Ser Glu Val Lys Thr Ser
1090 1095 1100

Asn Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile
1105 1110 1115 1120

Ser Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr
1125 1130 1135



Ile Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr
1140 1145 1150

Thr Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly
1155 1160 1165

Thr Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr
1170 1175 1180

Glu Asn Leu Val Thr Thr Glu Asn Ala Val Ile Asn Ala Thr Ser Gly
1185 1190 1195 1200

Thr Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu
1205 1210 1215

Ser Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys
1220 1225 1230

Val Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly
1235 1240 1245

Ala Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn
1250 1255 1260

Ala Asn Ile Thr Thr Lys Thr Gly Asp Ile Asn Gly Lys Val Glu Ser
1265 1270 1275 1280

Ser Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val
1285 1290 1295

Gly Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys
1300 1305 1310

Leu Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr
1315 1320 1325

Thr Ser Ser Gln Ser Gly Asp Ile Glu Gly Thr Ile Ser Gly Asn Thr
1330 1335 1340

Val Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala
1345 1350 1355 1360

Lys Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly
1365 1370 1375

Lys Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln
1380 1385 1390

Thr Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala
1395 1400 1405

Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp
1410 1415 1420

Ser Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp
1425 1430 1435 1440

Ala Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala
1445 1450 1455

Thr Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser
1460 1465 1470

Val Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile
1475 1480 1485



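Ser Glu Asn Gly Arg Asn Thr Val Arg Leu Arg Gly Lys Glu Ile Asp
1490 1495 1500

Val Lys Tyr Ile Gln Pro Gly Val Ala Ser Val Glu Glu Val Ile Glu
1505 1510 1515 1520

Ala Lys Arg Val Leu Glu Lys Val Lys Asp Leu Ser Asp Glu Glu Arg
1525 1530 1535

Glu Thr Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Val Glu Pro
1540 1545 1550

Asn Asn Ala Ile Thr Val Asn Thr Gln Asn Glu Phe Thr Thr Lys Pro
1555 1560 1565

Ser Ser Gln Val Thr Ile Ser Glu Gly Lys Ala Cys Phe Ser Ser Gly
1570 1575 1580

Asn Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro
1585 1590 1595 1600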

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Val Asp Glu Val Ile Glu Ala Lys Arg Ile Leu Glu Lys Val Lys Asp
Leu Ser Asp Glu Glu Arg Glu Ala Leu Ala Lys Leu Gly



SEQUENCE LISTING 

<110> Barenkamp, Stephen J. 

<120> HIGH MOLECULAR WEIGHT SURFACE PROTEINS OF NON-TYPEABLE
HAEMOPHILUS



<140> 
<141> 

<150> 09/155,614
<151> 1998-09-30 

<150> 08/617,697 
<151> 1996-04-01 

<150> PCT/US97/04707 
<151> 1997-04-01 

<160> 11 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 5116 
<212> DNA 

<213> Haemophilus influenzae 

<400> 1 

acagcgttct cttaatacta gtacaaaccc acaataaaat atgacaaaca acaattacaa 60 
cacctttttt gcagtctata tgcaaatatt ttaaaaaata gtataaatcc gccatataaa 120 
atggtataat ctttcatctt tcatctttca tctttcatct ttcatctttc atctttcatc 180 
tttcatcttt catctttcat ctttcatctt tcatctttca tctttcatct ttcatctttc 240 
acatgccctg atgaaccgag ggaagggagg gaggggcaag aatgaagagg gagctgaacg 300 
aacgcaaatg ataaagtaat ttaattgttc aactaacctt aggagaaaat atgaacaagc 360 
tatatcgtct caaattcagc aaacgcctga atgctttggt tgctgtgtct gaattggcac 420 
ggggttgtga ccattccaca gaaaaaggca gcgaaaaacc tgctcgcatg aaagtgcgtc 480 
acttagcgtt aaagccactt tccgctatgt tactatcttt aggtgtaaca tctattccac 540 
aatctgtttt agcaagcggc ttacaaggaa tggatgtagt acacggcaca gccactatgc 600 
aagtagatgg taataaaacc attatccgca acagtgttga cgatatcatt aattggaaac 660 
aatttaacat cgaccaaaat gaaatggtgc agtttttaca agaaaacaac aactccgccg 720 
tattcaaccg tgttacatct aaccaaatct cccaattaaa agggatttta gattctaacg 780 
gacaagtctt tttaatcaac ccaaatggta tcacaatagg taaagacgca attattaaca 840 
ctaatggctt tacggcttct acgctagaca tttctaacga aaacatcaag gcgcgtaatt 900 
tcaccttcga gcaaaccaaa gataaagcgc tcgctgaaat tgtgaatcac ggtttaatta 960 
ctgtcggtaa agacggcagt gtaaatctta ttggtggcaa agtgaaaaac gagggtgtga 1020 
ttagcgtaaa tggtggcagc atttctttac tcgcagggca aaaaatcacc atcagcgata 1080 
taataaaccc aaccattact tacagcattg ccgcgcctga aaatgaagcg gtcaatctgg 1140 
gcgatatttt tgccaaaggc ggtaacatta atgtccgtgc tgccactatt cgaaaccaag 1200 
gtaaactttc tgctgattct gtaagcaaag ataaaagcgg caatattgtt ctttccgcca 1260 
aagagggtga agcggaaatt ggcggtgtaa tttccgctca aaatcagcaa gctaaaggcg 1320 
gcaagctgat gattacaggc gataaagtca cattaaaaac aggtgcagtt atcgaccttt 1380 
caggtaaaga agggggagaa acttaccttg gcggtgacga gcgcggcgaa ggtaaaaagg 1440 
gcattcaatt agcaaagaaa acctctttag aaaaaggctc aaccatcaat gtatcaggca 1500 
aagaaaaagg cggacgcgct attgtgtggg gcgatattgc gttaattgac ggcaatatta 1560 
acgctcaagg tagtggtgat atcgctaaaa ccggtggttt tgtggagacg tcggggcatg 1620 
atttattcat caaagacaat gcaattgttg acgccaaaga gtggttgtta gacccggata 1680 
atgtatctat taatgcagaa acagcaggac gcagcaatac ttcagaagac gatgaataca 1740 
cgggatccgg gaatagtgcc agcaccccaa aacgaaacaa agaaaagaca acattaacaa 1800 



acacaactct tgagagtata ctaaaaaaag gtacctttgt taacatcact gctaatcaac 1860
gcatctatgt caatagctcc attaatttat ccaatggcag cttaactctt tggagtgagg 1920
gtcggagcgg tggcggcgtt gagattaaca acgatattac caccggtgat gataccagag 1980
gtgcaaactt aacaatttac tcaggcggct gggttgatgt tcataaaaat atctcactcg 2040
gggcgcaagg taacataaac attacagcta aacaagatat cgcctttgag aaaggaagca 2100
accaagtcat tacaggtcaa gggactatta cctcaggcaa tcaaaaaggt tttagattta 2160
ataatgtctc tctaaacggc actggcagcg gactgcaatt caccactaaa agaaccaata 2220
aatacgctat cacaaataaa tttgaaggga ctttaaatat ttcagggaaa gtgaacatct 2280
caatggtttt acctaaaaat gaaagtggat atgataaatt caaaggacgc acttactgga 2340
atttaacctc cttaaatgtt tccgagagtg gcgagtttaa cctcactatt gactccagag 2400
gaagcgatag tgcaggcaca cttacccagc cttataattt aaacggtata tcattcaaca 2460
aagacactac ctttaatgtt gaacgaaatg caagagtcaa ctttgacatc aaggcaccaa 2520
tagggataaa taagtattct agtttgaatt acgcatcatt taatggaaac atttcagttt 2580
cgggaggggg gagtgttgat ttcacacttc tcgcctcatc ctctaacgtc caaacccccg 2640
gtgtagttat aaattctaaa tactttaatg tttcaacagg gtcaagttta agatttaaaa 2700
cttcaggctc aacaaaaact ggcttctcaa tagagaaaga tttaacttta aatgccaccg 2760
gaggcaacat aacacttttg caagttgaag gcaccgatgg aatgattggt aaaggcattg 2820
tagccaaaaa aaacataacc tttgaaggag gtaacatcac ctttggctcc aggaaagccg 2880
taacagaaat cgaaggcaat gttactatca ataacaacgc taacgtcact cttatcggtt 2940
cggattttga caaccatcaa aaacctttaa ctattaaaaa agatgtcatc attaatagcg 3000
gcaaccttac cgctggaggc aatattgtca atatagccgg aaatcttacc gttgaaagta 3060
acgctaattt caaagctatc acaaatttca cttttaatgt aggcggcttg tttgacaaca 3120
aaggcaattc aaatatttcc attgccaaag gaggggctcg ctttaaagac attgataatt 3180
ccaagaattt aagcatcacc accaactcca gctccactta ccgcactatt ataagcggca 3240
atataaccaa taaaaacggt gatttaaata ttacgaacga aggtagtgat actgaaatgc 3300
aaattggcgg cgatgtctcg caaaaagaag gtaatctcac gatttcttct gacaaaatca 3360
atattaccaa acagataaca atcaaggcag gtgttgatgg ggagaattcc gattcagacg 3420
cgacaaacaa tgccaatcta accattaaaa ccaaagaatt gaaattaacg caagacctaa 3480
atatttcagg tttcaataaa gcagagatta cagctaaaga tggtagtgat ttaactattg 3540
gtaacaccaa tagtgctgat ggtactaatg ccaaaaaagt aacctttaac caggttaaag 3600
attcaaaaat ctctgctgac ggtcacaagg tgacactaca cagcaaagtg gaaacatccg 3660
gtagtaataa caacactgaa gatagcagtg acaataatgc cggcttaact atcgatgcaa 3720
aaaatgtaac agtaaacaac aatattactt ctcacaaagc agtgagcatc tctgcgacaa 3780
gtggagaaat taccactaaa acaggtacaa ccattaacgc aaccactggt aacgtggaga 3840
taaccgctca aacaggtagt atcctaggtg gaattgagtc cagctctggc tctgtaacac 3900
ttactgcaac cgagggcgct cttgctgtaa gcaatatttc gggcaacacc gttactgtta 3960
ctgcaaatag cggtgcatta accactttgg caggctctac aattaaagga accgagagtg 4020
taaccacttc aagtcaatca ggcgatatcg gcggtacgat ttctggtggc acagtagagg 4080
ttaaagcaac cgaaagttta accactcaat ccaattcaaa aattaaagca acaacaggcg 4140
aggctaacgt aacaagtgca acaggtacaa ttggtggtac gatttccggt aatacggtaa 4200
atgttacggc aaacgctggc gatttaacag ttgggaatgg cgcagaaatt aatgcgacag 4260
aaggagctgc aaccttaact acatcatcgg gcaaattaac taccgaagct agttcacaca 4320
ttacttcagc caagggtcag gtaaatcttt cagctcagga tggtagcgtt gcaggaagta 4380
ttaatgccgc caatgtgaca ctaaatacta caggcacttt aactaccgtg aagggttcaa 4440
acattaatgc aaccagcggt accttggtta ttaacgcaaa agacgctgag ctaaatggcg 4500
cagcattggg taaccacaca gtggtaaatg caaccaacgc aaatggctcc ggcagcgtaa 4560
tcgcgacaac ctcaagcaga gtgaacatca ctggggattt aatcacaata aatggattaa 4620
atatcatttc aaaaaacggt ataaacaccg tactgttaaa aggcgttaaa attgatgtga 4680
aatacattca accgggtata gcaagcgtag atgaagtaat tgaagcgaaa cgcatccttg 4740
agaaggtaaa agatttatct gatgaagaaa gagaagcgtt agctaaactt ggagtaagtg 4800
ctgtacgttt tattgagcca aataatacaa ttacagtcga tacacaaaat gaatttgcaa 4860
ccagaccatt aagtcgaata gtgatttctg aaggcagggc gtgtttctca aacagtgatg 4920
gcgcgacggt gtgcgttaat atcgctgata acgggcggta gcggtcagta attgacaagg 4980
tagatttcat cctgcaatga agtcatttta ttttcgtatt atttactgtg tgggttaaag 5040
ttcagtacgg gctttaccca tcttgtaaaa aattacggag aatacaataa agtattttta 5100
acaggttatt attatg 5116



<210> 2 
<211> 1536 
<212> PRT 

<213> Haemophilus influenzae 
<400> 2 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285





Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415

Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Thr Ala Gly Arg Ser Asn Thr Ser Glu Asp Asp Glu Tyr Thr
450 455 460

Gly Ser Gly Asn Ser Ala Ser Thr Pro Lys Arg Asn Lys Glu Lys Thr
465 470 475 480

Thr Leu Thr Asn Thr Thr Leu Glu Ser Ile Leu Lys Lys Gly Thr Phe
485 490 495

Val Asn Ile Thr Ala Asn Gln Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu Trp Ser Glu Gly Arg Ser Gly Gly
515 520 525

Gly Val Glu Ile Asn Asn Asp Ile Thr Thr Gly Asp Asp Thr Arg Gly
530 535 540

Ala Asn Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn
545 550 555 560

Ile Ser Leu Gly Ala Gln Gly Asn Ile Asn Ile Thr Ala Lys Gln Asp
565 570 575

Ile Ala Phe Glu Lys Gly Ser Asn Gln Val Ile Thr Gly Gln Gly Thr
580 585 590





Ile Thr Ser Gly Asn Gln Lys Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Gly Thr Gly Ser Gly Leu Gln Phe Thr Thr Lys Arg Thr Asn Lys
610 615 620

Tyr Ala Ile Thr Asn Lys Phe Glu Gly Thr Leu Asn Ile Ser Gly Lys
625 630 635 640

Val Asn Ile Ser Met Val Leu Pro Lys Asn Glu Ser Gly Tyr Asp Lys
645 650 655

Phe Lys Gly Arg Thr Tyr Trp Asn Leu Thr Ser Leu Asn Val Ser Glu
660 665 670

Ser Gly Glu Phe Asn Leu Thr Ile Asp Ser Arg Gly Ser Asp Ser Ala
675 680 685

Gly Thr Leu Thr Gln Pro Tyr Asn Leu Asn Gly Ile Ser Phe Asn Lys
690 695 700

Asp Thr Thr Phe Asn Val Glu Arg Asn Ala Arg Val Asn Phe Asp Ile
705 710 715 720

Lys Ala Pro Ile Gly Ile Asn Lys Tyr Ser Ser Leu Asn Tyr Ala Ser
725 730 735

Phe Asn Gly Asn Ile Ser Val Ser Gly Gly Gly Ser Val Asp Phe Thr
740 745 750

Leu Leu Ala Ser Ser Ser Asn Val Gln Thr Pro Gly Val Val Ile Asn
755 760 765

Ser Lys Tyr Phe Asn Val Ser Thr Gly Ser Ser Leu Arg Phe Lys Thr
770 775 780

Ser Gly Ser Thr Lys Thr Gly Phe Ser Ile Glu Lys Asp Leu Thr Leu
785 790 795 800

Asn Ala Thr Gly Gly Asn Ile Thr Leu Leu Gln Val Glu Gly Thr Asp
805 810 815

Gly Met Ile Gly Lys Gly Ile Val Ala Lys Lys Asn Ile Thr Phe Glu
820 825 830

Gly Gly Asn Ile Thr Phe Gly Ser Arg Lys Ala Val Thr Glu Ile Glu
835 840 845

Gly Asn Val Thr Ile Asn Asn Asn Ala Asn Val Thr Leu Ile Gly Ser
850 855 860

Asp Phe Asp Asn His Gln Lys Pro Leu Thr Ile Lys Lys Asp Val Ile
865 870 875 880

Ile Asn Ser Gly Asn Leu Thr Ala Gly Gly Asn Ile Val Asn Ile Ala
885 890 895

Gly Asn Leu Thr Val Glu Ser Asn Ala Asn Phe Lys Ala Ile Thr Asn
900 905 910

Phe Thr Phe Asn Val Gly Gly Leu Phe Asp Asn Lys Gly Asn Ser Asn
915 920 925

Ile Ser Ile Ala Lys Gly Gly Ala Arg Phe Lys Asp Ile Asp Asn Ser
930 935 940

Lys Asn Leu Ser Ile Thr Thr Asn Ser Ser Ser Thr Tyr Arg Thr Ile
945 950 955 960

Ile Ser Gly Asn Ile Thr Asn Lys Asn Gly Asp Leu Asn Ile Thr Asn
965 970 975

Glu Gly Ser Asp Thr Glu Met Gln Ile Gly Gly Asp Val Ser Gln Lys
980 985 990

Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr Lys Gln
995 1000 1005

Ile Thr Ile Lys Ala Gly Val Asp Gly Glu Asn Ser Asp Ser Asp Ala
1010 1015 1020

Thr Asn Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys Leu Thr
1025 1030 1035 1040

Gln Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala Lys
1045 1050 1055

Asp Gly Ser Asp Leu Thr Ile Gly Asn Thr Asn Ser Ala Asp Gly Thr
1060 1065 1070

Asn Ala Lys Lys Val Thr Phe Asn Gln Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

Ala Asp Gly His Lys Val Thr Leu His Ser Lys Val Glu Thr Ser Gly
1090 1095 1100

Ser Asn Asn Asn Thr Glu Asp Ser Ser Asp Asn Asn Ala Gly Leu Thr
1105 1110 1115 1120

Ile Asp Ala Lys Asn Val Thr Val Asn Asn Asn Ile Thr Ser His Lys
1125 1130 1135

Ala Val Ser Ile Ser Ala Thr Ser Gly Glu Ile Thr Thr Lys Thr Gly
1140 1145 1150

Thr Thr Ile Asn Ala Thr Thr Gly Asn Val Glu Ile Thr Ala Gln Thr
1155 1160 1165

Gly Ser Ile Leu Gly Gly Ile Glu Ser Ser Ser Gly Ser Val Thr Leu
1170 1175 1180

Thr Ala Thr Glu Gly Ala Leu Ala Val Ser Asn Ile Ser Gly Asn Thr
1185 1190 1195 1200

Val Thr Val Thr Ala Asn Ser Gly Ala Leu Thr Thr Leu Ala Gly Ser
1205 1210 1215

Thr Ile Lys Gly Thr Glu Ser Val Thr Thr Ser Ser Gln Ser Gly Asp
1220 1225 1230

Ile Gly Gly Thr Ile Ser Gly Gly Thr Val Glu Val Lys Ala Thr Glu
1235 1240 1245

Ser Leu Thr Thr Gln Ser Asn Ser Lys Ile Lys Ala Thr Thr Gly Glu
1250 1255 1260

Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly Thr Ile Ser Gly
1265 1270 1275 1280

Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu Thr Val Gly Asn
1285 1290 1295

Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr Leu Thr Thr Ser
1300 1305 1310

Ser Gly Lys Leu Thr Thr Glu Ala Ser Ser His Ile Thr Ser Ala Lys
1315 1320 1325

Gly Gln Val Asn Leu Ser Ala Gln Asp Gly Ser Val Ala Gly Ser Ile
1330 1335 1340

Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Val
1345 1350 1355 1360

Lys Gly Ser Asn Ile Asn Ala Thr Ser Gly Thr Leu Val Ile Asn Ala
1365 1370 1375

Lys Asp Ala Glu Leu Asn Gly Ala Ala Leu Gly Asn His Thr Val Val
1380 1385 1390

Asn Ala Thr Asn Ala Asn Gly Ser Gly Ser Val Ile Ala Thr Thr Ser
1395 1400 1405

Ser Arg Val Asn Ile Thr Gly Asp Leu Ile Thr Ile Asn Gly Leu Asn
1410 1415 1420

Ile Ile Ser Lys Asn Gly Ile Asn Thr Val Leu Leu Lys Gly Val Lys
1425 1430 1435 1440

Ile Asp Val Lys Tyr Ile Gln Pro Gly Ile Ala Ser Val Asp Glu Val
1445 1450 1455

Ile Glu Ala Lys Arg Ile Leu Glu Lys Val Lys Asp Leu Ser Asp Glu
1460 1465 1470

Glu Arg Glu Ala Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Ile
1475 1480 1485

Glu Pro Asn Asn Thr Ile Thr Val Asp Thr Gln Asn Glu Phe Ala Thr
1490 1495 1500

Arg Pro Leu Ser Arg Ile Val Ile Ser Glu Gly Arg Ala Cys Phe Ser
1505 1510 1515 1520

Asn Ser Asp Gly Ala Thr Val Cys Val Asn Ile Ala Asp Asn Gly Arg
1525 1530 1535



<210> 3 

<211> 4937 

<212> DNA 

<213> Haemophilus influenzae 



<400> 3 

taaatataca agataataaa aataaatcaa gatttttgtg atgacaaaca acaattacaa 60 
cacctttttt gcagtctata tgcaaatatt ttaaaaaaat agtataaatc cgccatataa 120 
aatggtataa tctttcatct ttcatcttta atctttcatc tttcatcttt catctttcat 180 
ctttcatctt tcatctttca tctttcatct ttcatctttc atctttcatc tttcatcttt 240 
cacatgaaat gatgaaccga gggaagggag ggaggggcaa gaatgaagag ggagctgaac 300
gaacgcaaat gataaagtaa tttaattgtt caactaacct taggagaaaa tatgaacaag 360 
atatatcgtc tcaaattcag caaacgcctg aatgctttgg ttgctgtgtc tgaattggca 420 
cggggttgtg accattccac agaaaaaggc ttccgctatg ttactatctt taggtgtaac 480 
cacttagcgt taaagccact ttccgctatg ttactatctt taggtgtaac atctattcca 540 
caatctgttt tagcaagcgg cttacaagga atggatgtag tacacggcac agccactatg 600 
caagtagatg gtaataaaac cattatccgc aacagtgttg acgctatcat taattggaaa 660 
caatttaaca tcgaccaaaa tgaaatggtg cagtttttac aagaaaacaa caactccgcc 720 
gtattcaacc gtgttacatc taaccaaatc tcccaattaa aagggatttt agattctaac 780
ggacaagtct ttttaatcaa cccaaatggt atcacaatag gtaaagacgc aattattaac 840
actaatggct ttacggcttc tacgctagac atttctaacg aaaacatcaa ggcgcgtaat 900 
ttcaccttcg agcaaaccaa agataaagcg ctcgctgaaa ttgtgaatca cggtttaatt 960 
actgtcggta aagacggcag tgtaaatctt attggtggca aagtgaaaaa cgagggtgtg 1020 
attagcgtaa atggtggcag catttcttta ctcgcagggc aaaaaatcac catcagcgat 1080 
ataataaacc caaccattac ttacagcatt gccgcgcctg aaaatgaagc ggtcaatctg 1140 
ggcgatattt ttgccaaagg cggtaacatt aatgtccgtg ctgccactat tcgaaaccaa 1200 
ggtaaacttt ctgctgattc tgtaagcaaa gataaaagcg gcaatattgt tctttccgcc 1260 
aaagagggtg aagcggaaat tggcggtgta atttccgctc aaaatcagca agctaaaggc 1320 
ggcaagctga tgattacagg cgataaagtc acattaaaaa caggtgcagt tatcgacctt 1380 
tcaggtaaag aagggggaga aacttacctt ggcggtgacg agcgcggcga aggtaaaaac 1440 
ggcattcaat tagcaaagaa aacctcttta gaaaaaggct caaccatcaa tgtatcaggc 1500 
aaagaaaaag gcggacgcgc tattgtgtgg ggcgatattg cgttaattga cggcaatatt 1560 
aacgctcaag gtagtggtga tatcgctaaa accggtggtt ttgtggagac atcggggcat 1620 
tatttatcca ttgacagcaa tgcaattgtt aaaacaaaag agtggttgct agaccctgat 1680 
gatgtaacaa ttgaagccga agaccccctt cgcaataata ccggtataaa tgatgaattc 1740 
ccaacaggca ccggtgaagc aagcgaccct aaaaaaaata gcgaactcaa aacaacgcta 1800 
accaatacaa ctatttcaaa ttatctgaaa aacgcctgga caatgaatat aacggcatca 1860 
agaaaactta ccgttaatag ctcaatcaac atcggaagca actcccactt aattctccat 1920 
agtaaaggtc agcgtggcgg aggcgttcag attgatggag atattacttc taaaggcgga 1980 
aatttaacca tttattctgg cggatgggtt gatgttcata aaaatattac gcttgatcag 2040
ggttttttaa atattaccgc cgcttccgta gcttttgaag gtggaaataa caaagcacgc 2100 
gacgcggcaa atgctaaaat tgtcgcccag ggcactgtaa ccattacagg agagggaaaa 2160 
gatttcaggg ctaacaacgt atctttaaac ggaacgggta aaggtctgaa tatcatttca 2220 
tcagtgaata atttaaccca caatcttagt ggcacaatta acatatctgg gaatataaca 2280 
attaaccaaa ctacgagaaa gaacacctcg tattggcaaa ccagccatga ttcgcactgg 2340
aacgtcagtg ctcttaatct agagacaggc gcaaatttta cctttattaa atacatttca 2400 
agcaatagca aaggcttaac aacacagtat agaagctctg caggggtgaa ttttaacggc 2460 
gtaaatggca acatgtcatt caatctcaaa gaaggagcga aagttaattt caaattaaaa 2520 
ccaaacgaga acatgaacac aagcaaacct ttaccaattc ggtttttagc caatatcaca 2580
gccactggtg ggggctctgt tttttttgat atatatgcca accattctgg cagaggggct 2640 
gagttaaaaa tgagtgaaat taatatctct aacggcgcta attttacctt aaattcccat 2700 
gttcgcggcg atgacgcttt taaaatcaac aaagacttaa ccataaatgc aaccaattca 2760
aatttcagcc tcagacagac gaaagatgat ttttatgacg ggtacgcacg caatgccatc 2820
aattcaacct acaacatatc cattctgggc ggtaatgtca cccttggtgg acaaaactca 2880



agcagcagca ttacggggaa tattactatc gagaaagcag caaatgttac gctagaagcc 2940
aataacgccc ctaatcagca aaacataagg gatagagtta taaaacttgg cagcttgctc 3000 
gttaatggga gtttaagttt aactggcgaa aatgcagata ttaaaggcaa tctcactatt 3060 
tcagaaagcg ccacttttaa aggaaagact agagataccc taaatatcac cggcaatttt 3120 
accaataatg gcactgccga aattaatata acacaaggag tggtaaaact tggcaatgtt 3180 
accaatgatg gtgatttaaa cattaccact cacgctaaac gcaaccaaag aagcatcatc 3240 
ggcggagata taatcaacaa aaaaggaagc ttaaatatta cagacagtaa taatgatgct 3300
gaaatccaaa ttggcggcaa tatctcgcaa aaagaaggca acctcacgat ttcttccgat 3360 
aaaattaata tcaccaaaca gataacaatc aaaaagggta ttgatggaga ggactctagt 3420 
tcagatgcga caagtaatgc caacctaact attaaaacca aagaattgaa attgacagaa 3480 
gacctaagta tttcaggttt caataaagca gagattacag ccaaagatgg tagagattta 3540 
actattggca acagtaatga cggtaacagc ggtgccgaag ccaaaacagt aacttttaac 3600 
aatgttaaag attcaaaaat ctctgctgac ggtcacaatg tgacactaaa tagcaaagtg 3660 
aaaacatcta gcagcaatgg cggacgtgaa agcaatagcg acaacgatac cggcttaact 3720
attactgcaa aaaatgtaga agtaaacaaa gatattactt ctctcaaaac agtaaatatc 3780 
accgcgtcgg aaaaggttac caccacagca ggctcgacca ttaacgcaac aaatggcaaa 3840
gcaagtatta caaccaaaac aggtgatatc agcggtacga tttccggtaa cacggtaagt 3900 
gttagcgcga ctggtgattt aaccactaaa tccggctcaa aaattgaagc gaaatcgggt 3960
gaggctaatg taacaagtgc aacaggtaca attggcggta caatttccgg taatacggta 4020 
aatgttacgg caaacgctgg cgatttaaca gttgggaatg gcgcagaaat taatgcgaca 4080 
gaaggagctg caaccttaac cgcaacaggg aataccttga ctactgaagc cggttctagc 4140 
atcacttcaa ctaagggtca ggtagacctc ttggctcaga atggtagcat cgcaggaagc 4200 
attaatgctg ctaatgtgac attaaatact acaggcacct taaccaccgt ggcaggctcg 4260 
gatattaaag caaccagcgg caccttggtt attaacgcaa aagatgctaa gctaaatggt 4320 
gatgcatcag gtgatagtac agaagtgaat gcagtcaacg caagcggctc tggtagtgtg 4380
actgcggcaa cctcaagcag tgtgaatatc actggggatt taaacacagt aaatgggtta 4440
aatatcattt cgaaagatgg tagaaacact gtgcgcttaa gaggcaagga aattgaggtg 4500 
aaatatatcc agccaggtgt agcaagtgta gaagaagtaa ttgaagcgaa acgcgtcctt 4560 
gaaaaagtaa aagatttatc tgatgaagaa agagaaacat tagctaaact tggtgtaagt 4620 
gctgtacgtt ttgttgagcc aaataataca attacagtca atacacaaaa tgaatttaca 4680 
accagaccgt caagtcaagt gataatttct gaaggtaagg cgtgtttctc aagtggtaat 4740 
ggcgcacgag tatgtaccaa tgttgctgac gatggacagc cgtagtcagt aattgacaag 4800 
gtagatttca tcctgcaatg aagtcatttt attttcgtat tatttactgt gtgggttaaa 4860
gttcagtacg ggctttaccc atcttgtaaa aaattacgga gaatacaata aagtattttt 4920 
aacaggttat tattatg 4937



<210> 4 
<211> 1477 
<212> PRT 

<213> Haemophilus influenzae 
<400> 4

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys 
20 25 30 

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys 
35 40 45 

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335 

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Phe Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400 



Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415

Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Asp Pro Leu Phe Asn Asn Thr Gly Ile Asn Asp Glu Phe Pro
450 455 460 

Thr Gly Thr Gly Glu Ala Ser Asp Pro Lys Lys Asn Ser Glu Leu Lys 
465 470 475 480 

Thr Thr Leu Thr Asn Thr Thr Ile Ser Asn Tyr Leu Lys Asn Ala Trp
485 490 495

Thr Met Asn Ile Thr Ala Ser Arg Lys Leu Thr Val Asn Ser Ser Ile
500 505 510

Asn Ile Gly Ser Asn Ser His Leu Ile Leu His Ser Lys Gly Gln Arg
515 520 525

Gly Gly Gly Val Gln Ile Asp Gly Asp Ile Thr Ser Lys Gly Gly Asn
530 535 540

Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr
545 550 555 560

Leu Asp Gln Gly Phe Leu Asn Ile Thr Ala Ala Ser Val Ala Phe Glu
565 570 575

Gly Gly Asn Asn Lys Ala Arg Asp Ala Ala Asn Ala Lys Ile Val Ala
580 585 590

Gln Gly Thr Val Thr Ile Thr Gly Glu Gly Lys Asp Phe Arg Ala Asn
595 600 605

Asn Val Ser Leu Asn Gly Thr Gly Lys Gly Leu Asn Ile Ile Ser Ser
610 615 620

Val Asn Asn Leu Thr His Asn Leu Ser Gly Thr Ile Asn Ile Ser Gly
625 630 635 640

Asn Ile Thr Ile Asn Gln Thr Thr Arg Lys Asn Thr Ser Tyr Trp Gln
645 650 655 

Thr Ser His Asp Ser His Trp Asn Val Ser Ala Leu Asn Leu Glu Thr 
660 665 670 

Gly Ala Asn Phe Thr Phe Ile Lys Tyr Ile Ser Ser Asn Ser Lys Gly
675 680 685

Leu Thr Thr Gln Tyr Arg Ser Ser Ala Gly Val Asn Phe Asn Gly Val
690 695 700 



Asn Gly Asn Met Ser Phe Asn Leu Lys Glu Gly Ala Lys Val Asn Phe 
705 710 715 720 

Lys Leu Lys Pro Asn Glu Asn Met Asn Thr Ser Lys Pro Leu Pro Ile
725 730 735

Arg Phe Leu Ala Asn Ile Thr Ala Thr Gly Gly Gly Ser Val Phe Phe
740 745 750

Asp Ile Tyr Ala Asn His Ser Gly Arg Gly Ala Glu Leu Lys Met Ser
755 760 765

Glu Ile Asn Ile Ser Asn Gly Ala Asn Phe Thr Leu Asn Ser His Val
770 775 780

Arg Gly Asp Asp Ala Phe Lys Ile Asn Lys Asp Leu Thr Ile Asn Ala
785 790 795 800

Thr Asn Ser Asn Phe Ser Leu Arg Gln Thr Lys Asp Asp Phe Tyr Asp
805 810 815

Gly Tyr Ala Arg Asn Ala Ile Asn Ser Thr Tyr Asn Ile Ser Ile Leu
820 825 830

Gly Gly Asn Val Thr Leu Gly Gly Gln Asn Ser Ser Ser Ser Ile Thr
835 840 845

Gly Asn Ile Thr Ile Glu Lys Ala Ala Asn Val Thr Leu Glu Ala Asn
850 855 860

Asn Ala Pro Asn Gln Gln Asn Ile Arg Asp Arg Val Ile Lys Leu Gly
865 870 875 880 

Ser Leu Leu Val Asn Gly Ser Leu Ser Leu Thr Gly Glu Asn Ala Asp 
885 890 895 

Ile Lys Gly Asn Leu Thr Ile Ser Glu Ser Ala Thr Phe Lys Gly Lys
900 905 910

Thr Arg Asp Thr Leu Asn Ile Thr Gly Asn Phe Thr Asn Asn Gly Thr
915 920 925

Ala Glu Ile Asn Ile Thr Gln Gly Val Val Lys Leu Gly Asn Val Thr
930 935 940

Asn Asp Gly Asp Leu Asn Ile Thr Thr His Ala Lys Arg Asn Gln Arg
945 950 955 960

Ser Ile Ile Gly Gly Asp Ile Ile Asn Lys Lys Gly Ser Leu Asn Ile
965 970 975

Thr Asp Ser Asn Asn Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr
995 1000 1005

Lys Gln Ile Thr Ile Lys Lys Gly Ile Asp Gly Glu Asp Ser Ser Ser
1010 1015 1020 



Asp Ala Thr Ser Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Thr Glu Asp Leu Ser Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asp Gly Arg Asp Leu Thr Ile Gly Asn Ser Asn Asp Gly Asn
1060 1065 1070 

Ser Gly Ala Glu Ala Lys Thr Val Thr Phe Asn Asn Val Lys Asp Ser 
1075 1080 1085 

Lys Ile Ser Ala Asp Gly His Asn Val Thr Leu Asn Ser Lys Val Lys
1090 1095 1100 

Thr Ser Ser Ser Asn Gly Gly Arg Glu Ser Asn Ser Asp Asn Asp Thr 
1105 1110 1115 1120 

Gly Leu Thr Ile Thr Ala Lys Asn Val Glu Val Asn Lys Asp Ile Thr
1125 1130 1135

Ser Leu Lys Thr Val Asn Ile Thr Ala Ser Glu Lys Val Thr Thr Thr
1140 1145 1150

Ala Gly Ser Thr Ile Asn Ala Thr Asn Gly Lys Ala Ser Ile Thr Thr
1155 1160 1165

Lys Thr Gly Asp Ile Ser Gly Thr Ile Ser Gly Asn Thr Val Ser Val
1170 1175 1180

Ser Ala Thr Val Asp Leu Thr Thr Lys Ser Gly Ser Lys Ile Glu Ala
1185 1190 1195 1200

Lys Ser Gly Glu Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly
1205 1210 1215

Thr Ile Ser Gly Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu
1220 1225 1230

Thr Val Gly Asn Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr
1235 1240 1245

Leu Thr Ala Thr Gly Asn Thr Leu Thr Thr Glu Ala Gly Ser Ser Ile
1250 1255 1260

Thr Ser Thr Lys Gly Gln Val Asp Leu Leu Ala Gln Asn Gly Ser Ile
1265 1270 1275 1280

Ala Gly Ser Ile Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr
1285 1290 1295

Leu Thr Thr Val Ala Gly Ser Asp Ile Lys Ala Thr Ser Gly Thr Leu
1300 1305 1310

Val Ile Asn Ala Lys Asp Ala Lys Leu Asn Gly Asp Ala Ser Gly Asp
1315 1320 1325 



Ser Thr Glu Val Asn Ala Val Asn Ala Ser Gly Ser Gly Ser Val Thr 
1330 1335 1340 

Ala Ala Thr Ser Ser Ser Val Asn Ile Thr Gly Asp Leu Asn Thr Val
1345 1350 1355 1360

Asn Gly Leu Asn Ile Ile Ser Lys Asp Gly Arg Asn Thr Val Arg Leu
1365 1370 1375

Arg Gly Lys Glu Ile Glu Val Lys Tyr Ile Gln Pro Gly Val Ala Ser
1380 1385 1390

Val Glu Glu Val Ile Glu Ala Lys Arg Val Leu Glu Lys Val Lys Asp
1395 1400 1405 

Leu Ser Asp Glu Glu Arg Glu Thr Leu Ala Lys Leu Gly Val Ser Ala 
1410 1415 1420 

Val Arg Phe Val Glu Pro Asn Asn Thr Ile Thr Val Asn Thr Gln Asn
1425 1430 1435 1440

Glu Phe Thr Thr Arg Pro Ser Ser Gln Val Ile Ile Ser Glu Gly Lys
1445 1450 1455 

Ala Cys Phe Ser Ser Gly Asn Gly Ala Arg Val Cys Thr Asn Val Ala 
1460 1465 1470 

Asp Asp Gly Gln Pro
1475 



<210> 5 
<211> 9171 
<212> DNA 

<213> Haemophilus influenzae 
<400> 5 

acagcgttct cttaatacta gtacaaaccc acaataaaat atgacaaaca acaattacaa 60 
cacctttttt gcagtctata tgcaaatatt ttaaaaaata gtataaatcc gccatataaa 120 
atggtataat ctttcatctt tcatctttca tctttcatct ttcatctttc atctttcatc 180 
tttcatcttt catctttcat ctttcatctt tcatctttca tctttcatct ttcatctttc 240 
acatgaaatg atgaaccgag ggaagggagg gaggggcaag aatgaagagg gagctgaacg 300
aacgcaaatg ataaagtaat ttaattgttc aactaacctt aggagaaaat atgaacaaga 360 
tatatcgtct caaattcagc aaacgcctga atgctttggt tgctgtgtct gaattggcac 420 
ggggttgtga ccattccaca gaaaaaggca gcgaaaaacc tgctcgcatg aaagtgcgtc 480
acttagcgtt aaagccactt tccgctatgt tactatcttt aggtgtaaca tctattccac 540 
aatctgtttt agcaagcggc ttacaaggaa tggatgtagt acacggcaca gccactatgc 600 
aagtagatgg taataaaacc attatccgca acagtgttga cgctatcatt aattggaaac 660 
aatttaacat cgaccaaaat gaaatggtgc agtttttaca agaaaacaac aactccgccg 720 
tattcaaccg tgttacatct aaccaaatct cccaattaaa agggatttta gattctaacg 780 
gacaagtctt tttaatcaac ccaaatggta tcacaatagg taaagacgca attattaaca 840
ctaatggctt tacggcttct acgctagaca tttctaacga aaacatcaag gcgcgtaatt 900 
tcaccttcga gcaaaccaaa gataaagcgc tcgctgaaat tgtgaatcac ggtttaatta 960 
ctgtcggtaa agacggcagt gtaaatctta ttggtggcaa agtgaaaaac gagggtgtga 1020
ttagcgtaaa tggtggcagc atttctttac tcgcagggca aaaaatcacc atcagcgata 1080
taataaaccc aaccattact tacagcattg ccgcgcctga aaatgaagcg gtcaatctgg 1140
gcgatatttt tgccaaaggc ggtaacatta atgtccgtgc tgccactatt cgaaaccaag 1200 
ctttccgcca aagagggtga agcggaaatt ggcggtgtaa tttccgctca aaatcagcaa 1260 
gctaaaggcg gcaagctgat gattacaggc gataaagtca cattaaaaac aggtgcagtt 1320 
atcgaccttt caggtaaaga agggggagaa acttaccttg gcggtgacga gcgcggcgaa 1380
ggtaaaaacg gcattcaatt agcaaagaaa acctctttag aaaaaggctc aaccatcaat 1440 
gtatcaggca aagaaaaagg cggacgcgct attgtgtggg gcgatattgc gttaattgac 1500 
ggcaatatta acgctcaagg tagtggtgat atcgctaaaa ccggtggttt tgtggagacg 1560 
tcggggcatg atttattcat caaagacaat gcaattgttg acgccaaaga gtggttgtta 1620 
gacccggata atgtatctat taatgcagaa acagcaggac gcagcaatac ttcagaagac 1680 
gatgaataca cgggatccgg gaatagtgcc agcaccccaa aacgaaacaa agaaaagaca 1740 
acattaacaa acacaactct tgagagtata ctaaaaaaag gtacctttgt taacatcact 1800 
gctaatcaac gcatctatgt caatagctcc attaatttat ccaatggcag cttaactctt 1860 
tggagtgagg gtcggagcgg tggcggcgtt gagattaaca acgatattac caccggtgat 1920 
gataccagag gtgcaaactt aacaatttac tcaggcggct gggttgatgt tcataaaaat 1980 
atctcactcg gggcgcaagg taacataaac attacagcta aacaagatat cgcctttgag 2040
aaaggaagca accaagtcat tacaggtcaa gggactatta cctcaggcaa tcaaaaaggt 2100 
tttagattta ataatgtctc tctaaacggc actggcagcg gactgcaatt caccactaaa 2160
agaaccaata aatacgctat cacaaataaa tttgaaggga ctttaaatat ttcagggaaa 2220 
gtgaacatct caatggtttt acctaaaaat gaaagtggat atgataaatt caaaggacgc 2280
acttactgga atttaacctc gaaagtggat atgataaatt caaaggacgc cctcactatt 2340 
gactccagag gaagcgatag tgcaggcaca cttacccagc cttataattt aaacggtata 2400 
tcattcaaca aagacactac ctttaatgtt gaacgaaatg caagagtcaa ctttgacatc 2460 
aaggcaccaa tagggataaa taagtattct agtttgaatt acgcatcatt taatggaaac 2520
atttcagttt cgggaggggg gagtgttgat ttcacacttc tcgcctcatc ctctaacgtc 2580 
caaacccccg gtgtagttat aaattctaaa tactttaatg tttcaacagg gtcaagttta 2640 
agatttaaaa cttcaggctc aacaaaaact ggcttctcaa tagagaaaga tttaacttta 2700 
aatgccaccg gaggcaacat aacacttttg caagttgaag gcaccgatgg aatgattggt 2760 
aaaggcattg tagccaaaaa aaacataacc tttgaaggag gtaagatgag gtttggctcc 2820
aggaaagccg taacagaaat cgaaggcaat gttactatca ataacaacgc taacgtcact 2880 
cttatcggtt cggattttga caaccatcaa aaacctttaa ctattaaaaa agatgtcatc 2940 
attaatagcg gcaaccttac cgctggaggc aatattgtca atatagccgg aaatcttacc 3000 
gttgaaagta acgctaattt caaagctatc acaaatttca cttttaatgt aggcggcttg 3060 
tttgacaaca aaggcaattc aaatatttcc attgccaaag gaggggctcg ctttaaagac 3120 
attgataatt ccaagaattt aagcatcacc accaactcca gctccactta ccgcactatt 3180 
ataagcggca atataaccaa taaaaacggt gatttaaata ttacgaacga aggtagtgat 3240
actgaaatgc aaattggcgg cgatgtctcg caaaaagaag gtaatctcac gatttcttct 3300
gacaaaatca atattaccaa acagataaca atcaaggcag gtgttgatgg ggagaattcc 3360
gattcagacg cgacaaacaa tgccaatcta accattaaaa ccaaagaatt gaaattaacg 3420
caagacctaa atatttcagg tttcaataaa gcagagatta cagctaaaga tggtagtgat 3480 
ttaactattg gtaacaccaa tagtgctgat ggtactaatg ccaaaaaagt aacctttaac 3540 
caggttaaag attcaaaaat ctctgctgac ggtcacaagg tgacactaca cagcaaagtg 3600 
gaaacatccg gtagtaataa caacactgaa gatagcagtg acaataatgc cggcttaact 3660 
atcgatgcaa aaaatgtaac agtaaacaac aatattactt ctcacaaagc agtgagcatc 3720 
tctgcgacaa gtggagaaat taccactaaa acaggtacaa ccattaacgc aaccactggt 3780 
aacgtggaga taaccgctca aacaggtagt atcctaggtg gaattgagtc cagctctggc 3840
tctgtaacac ttactgcaac cgagggcgct cttgctgtaa gcaatatttc gggcaacacc 3900 
gttactgtta ctgcaaatag cggtgcatta accactttgg caggctctac aattaaagga 3960
accgagagtg taaccacttc aagtcaatca ggcgatatcg gcggtacgat ttctggtggc 4020 
acagtagagg ttaaagcaac cgaaagttta accactcaat ccaattcaaa aattaaagca 4080 
acaacaggcg aggctaacgt aacaagtgca acaggtacaa ttggtggtac gatttccggt 4140 
aatacggtaa atgttacggc aaacgctggc gatttaacag ttgggaatgg cgcagaaatt 4200 
aatgcgacag aaggagctgc aaccttaact acatcatcgg gcaaattaac taccgaagct 4260 
agttcacaca ttacttcagc caagggtcag gtaaatcttt cagctcagga tggtagcgtt 4320 
gcaggaagta ttaatgccgc caatgtgaca ctaaatacta caggcacttt aactaccgtg 4380 
aagggttcaa acattaatgc aaccagcggt accttggtta ttaacgcaaa agacgctgag 4440 
ctaaatggcg cagcattggg taaccacaca gtggtaaatg caaccaacgc aaatggctcc 4500 
ggcagcgtaa tcgcgacaac ctcaagcaga gtgaacatca ctggggattt aatcacaata 4560 
aatggattaa atatcatttc aaaaaacggt ataaacaccg tactgttaaa aggcgttaaa 4620 
attgatgtga aatacattca accgggtata gcaagcgtag atgaagtaat tgaagcgaaa 4680 
cgcatccttg agaaggtaaa agatttatct gatgaagaaa gagaagcgtt agctaaactt 4740 
ggcgtaagtg ctgtacgttt tattgagcca aataatacaa ttacagtcga tacacaaaat 4800 
gaatttgcaa ccagaccatt aagtcgaata gtgatttctg aaggcagggc gtgtttctca 4860 
aacagtgatg gcgcgacggt gtgcgttaat atcgctgata acgggcggta gcggtcagta 4920
attgacaagg tagatttcat cctgcaatga agtcatttta ttttcgtatt atttactgtg 4980 
tgggttaaag ttcagtacgg gctttaccca tcttgtaaaa aattacggag aatacaataa 5040
agtattttta acaggttatt attatgaaaa atataaaaag cagattaaaa ctcagtgcaa 5100 
tatcagtatt gcttggcctg gcttcttcat cattgtatgc agaagaagcg tttttagtaa 5160
aaggctttca gttatctggt gcacttgaaa ctttaagtga agacgcccaa ctgtctgtag 5220 
caaaatcttt atctaaatac caaggctcgc aaactttaac aaacctaaaa acagcacagc 5280
ttgaattaca ggctgtgcta gataagattg agccaaataa gtttgatgtg atattgccac 5340
aacaaaccat tacggatggc aatattatgt ttgagctagt ctcgaaatca gccgcagaaa 5400
gccaagtttt ttataaggcg agccagggtt atagtgaaga aaatatcgct cgtagcctgc 5460
catctttgaa acaaggaaaa gtgtatgaag atggtcgtca gtggttcgat ttgcgtgaat 5520 
tcaatatggc aaaagaaaat ccacttaaag tcactcgcgt gcattacgag ttaaacccta 5580 
aaaacaaaac ctctgatttg gtagttgcag gtttttcgcc ttttggcaaa acgcgtagct 5640 
ttgtttccta tgataatttc ggcgcaaggg agtttaacta tcaacgtgta agtctaggtt 5700 
ttgtaaatgc caatttgacc ggacatgatg atgtattaaa tctaaacgca ttgaccaatg 5760 
taaaagcacc atcaaaatct tatgcggtag gcataggata tacttatccg ttttatgata 5820 
aacaccaatc cttaagtctt tataccagca tgagttatgc tgattctaat gatatcgacg 5880 
gcttaccaag tgcgattaat cgtaaattat caaaaggtca atctatctct gcgaatctga 5940 
aatggagtta ttatctcccg acatttaacc ttggaatgga agaccagttt aaaattaatt 6000 
taggctacaa ctaccgccat attaatcaaa catccgagtt aaacaccctg ggtgcaacga 6060 
agaaaaaatt tgcagtatca ggcgtaagtg caggcattga tggacatatc caatttaccc 6120 
ctaaaacaat ctttaatatt gatttaactc atcattatta cgcgagtaaa ttaccaggct 6180 
cttttggaat ggagcgcatt ggcgaaacat ttaatcgcag ctatcacatt agcacagcca 6240 
gtttagggtt gagtcaagag tttgctcaag gttggcattt tagcagtcaa ttatcgggtc 6300 
agtttactct acaagatata agtagcatag atttattctc tgtaacaggt acttatggcg 6360 
tcagaggctt taaatacggc ggtgcaagtg gtgagcgcgg tcttgtatgg cgtaatgaat 6420 
taagtatgcc aaaatacacc cgctttcaaa tcagccctta tgcgttttat gatgcaggtc 6480 
agttccgtta taatagcgaa aatgctaaaa cttacggcga agatatgcac acggtatcct 6540
ctgcgggttt aggcattaaa acctctccta cacaaaactt aagcttagat gcttttgttg 6600 
ctcgtcgctt tgcaaatgcc aatagtgaca atttgaatgg caacaaaaaa cgcacaagct 6660 
cacctacaac cttctggggt agattaacat tcagtttcta accctgaaat ttaatcaact 6720 
ggtaagcgtt ccgcctacca gtttataact atatgcttta cccgccaatt tacagtctat 6780 
acgcaaccct gttttcatcc ttatatatca aacaaactaa gcaaaccaag caaaccaagc 6840 
aaaccaagca aaccaagcaa accaagcaaa ccaagcaaac caagcaaacc aagcaaacca 6900 
agcaaaccaa gcaaaccaag caaaccaagc aaaccaagca atgctaaaaa acaatttata 6960 
tgataaacta aaacatactc cataccatgg caatacaagg gatttaataa tatgacaaaa 7020 
gaaaatttac aaagtgttcc acaaaatacg accgcttcac ttgtagaatc aaacaacgac 7080 
caaacttccc tgcaaatact taaacaacca cccaaaccca acctattacg cctggaacaa 7140 
catgtcgcca aaaaagatta tgagcttgct tgccgcgaat taatggcgat tttggaaaaa 7200 
atggacgcta attttggagg cgttcacgat attgaatttg acgcacctgc tcagctggca 7260 
tatctacccg aaaaactact aattcatttt gccactcgtc tcgctaatgc aattacaaca 7320 
ctcttttccg accccgaatt ggcaatttcc gaagaagggg cattaaagat gattagcctg 7380
caacgctggt tgacgctgat ttttgcctct tccccctacg ttaacgcaga ccatattctc 7440 
aataaatata atatcaaccc agattccgaa ggtggctttc atttagcaac agacaactct 7500 
tctattgcta aattctgtat tttttactta cccgaatcca atgtcaatat gagtttagat 7560 
gcgttatggg cagggaatca acaactttgt gcttcattgt gttttgcgtt gcagtcttca 7620 
cgttttattg gtactgcatc tgcgtttcat aaaagagcgg tggttttaca gtggtttcct 7680 
aaaaaactcg ccgaaattgc taatttagat gaattgcctg caaatatcct tcatgatgta 7740 
tatatgcact gcagttatga tttagcaaaa aacaagcacg atgttaagcg tccattaaac 7800 
gaacttgtcc gcaagcatat cctcacgcaa ggatggcaag accgctacct ttacacctta 7860 
ggtaaaaagg acggcaaacc tgtgatgatg gtactgcttg aacattttaa ttcgggacat 7920 
tcgatttatc gcacgcattc aacttcaatg attgctgctc gagaaaaatt ctatttagtc 7980 
ggcttaggcc atgagggcgt tgataacata ggtcgagaag tgtttgacga gttctttgaa 8040 
atcagtagca ataatataat ggagagactg ttttttatcc gtaaacagtg cgaaactttc 8100 
caacccgcag tgttctatat gccaagcatt ggcatggata ttaccacgat ttttgtgagc 8160 
aacactcggc ttgcccctat tcaagctgta gccttgggtc atcctgccac tacgcattct 8220 
gaatttattg attatgtcat cgtagaagat gattatgtgg gcagtgaaga ttgttttagc 8280 
gaaacccttt tacgcttacc caaagatgcc ctaccttatg taccatctgc actcgcccca 8340 
caaaaagtgg attatgtact cagggaaaac cctgaagtag tcaatatcgg tattgccgct 8400 
accacaatga aattaaaccc tgaatttttg ctaacattgc aagaaatcag agataaagct 8460
aaagtcaaaa tacattttca tttcgcactt ggacaatcaa caggcttgac acacccttat 8520 
gtcaaatggt ttatcgaaag ctatttaggt gacgatgcca ctgcacatcc ccacgcacct 8580 
tatcacgatt atctggcaat attgcgtgat tgcgatatgc tactaaatcc gtttcctttc 8640 
ggtaatacta acggcataat tgatatggtt acattaggtt tagttggtgt atgcaaaacg 8700
ggggatgaag tacatgaaca tattgatgaa ggtctgttta aacgcttagg actaccagaa 8760 
tggctgatag ccgacacacg agaaacatat attgaatgtg ctttgcgtct agcagaaaac 8820 
catcaagaac gccttgaact ccgtcgttac atcatagaaa acaacggctt acaaaagctt 8880 
tttacaggcg accctcgtcc attgggcaaa atactgctta agaaaacaaa tgaatggaag 8940 
cggaagcact tgagtaaaaa ataacggttt tttaaagtaa aagtgcggtt aattttcaaa 9000 
gcgttttaaa aacctctcaa aaatcaaccg cacttttatc tttataacgc tcccgcgcgc 9060 
tgacagttta tctctttctt aaaataccca taaaattgtg gcaatagttg ggtaatcaaa 9120 
ttcaattgtt gatacggcaa actaaagacg gcgcgttctt cggcagtcat c 9171 

<210> 6 
<211> 9323 
<212> DNA 

<213> Haemophilus influenzae 
<400> 6 

cgccacttca attttggatt gttgaaattc aactaaccaa aaagtgcggt taaaatctgt 60 
ggagaaaata ggttgtagtg aagaacgagg taattgttca aaaggataaa gctctcttaa 120 
ttgggcattg gttggcgttt ctttttcggt taatagtaaa ttatattctg gacgactatg 180 
caatccacca acaactttac cgttggtttt aagcgttaat gtaagttctt gctcttcttg 240 
gcgaatacgt aatcccattt tttgtttagc aagaaaatga tcgggataat cataataggt 300
gttgcccaaa aataaatttt gatgttctaa aatcataaat tttgcaagat attgtggcaa 360 
ttcaatacct atttgtggcg aaatcgccaa ttttaattca atttcttgta gcataatatt 420 
tcccactcaa atcaactggt taaatataca agataataaa aataaatcaa gatttttgtg 480 
atgacaaaca acaattacaa cacctttttt gcagtctata tgcaaatatt ttaaaaaaat 540 
agtataaatc cgccatataa aatggtataa tctttcatct ttcatctttc atctttcatc 600 
tttcatcttt catctttcat ctttcatctt tcatctttca tctttcatct ttcatctttc 660 
atctttcatc tttcatcttt cacatgaaat gatgaaccga gggaagggag ggaggggcaa 720 
gaatgaagag ggagctgaac gaacgcaaat gataaagtaa tttaattgtt caactaacct 780 
taggagaaaa tatgaacaag atatatcgtc tcaaattcag caaacgcctg aatgctttgg 840 
ttgctgtgtc tgaattggca cggggttgtg accattccac agaaaaaggc agcgaaaaac 900 
ctgctcgcat gaaagtgcgt cacttagcgt taaagccact ttccgctatg ttactatctt 960 
taggtgtaac atctattcca caatctgttt tagcaagcgg caatttaaca tcgaccaaaa 1020 
tgaaatggtg cagtttttac aagaaaacaa gtaataaaac cattatccgc aacagtgttg 1080 
acgctatcat taattggaaa caatttaaca tcgaccaaaa tgaaatggtg cagtttttac 1140 
aagaaaacaa caactccgcc gtattcaacc gtgttacatc taaccaaatc tcccaattaa 1200 
aagggatttt agattctaac ggacaagtct ttttaatcaa cccaaatggt atcacaatag 1260 
gtaaagacgc aattattaac actaatggct ttacggcttc tacgctagac atttctaacg 1320 
aaaacatcaa ggcgcgtaat ttcaccttcg agcaaaccaa agataaagcg ctcgctgaaa 1380 
ttgtgaatca cggtttaatt actgtcggta aagacggcag tgtaaatctt attggtggca 1440 
aagtgaaaaa cgagggtgtg attagcgtaa atggtggcag catttcttta ctcgcagggc 1500 
aaaaaatcac catcagcgat ataataaacc caaccattac ttacagcatt gccgcgcctg 1560 
aaaatgaagc ggtcaatctg ggcgatattt ttgccaaagg cggtaacatt aatgtccgtg 1620 
ctgccactat tcgaaaccaa ggtaaacttt ctgctgattc tgtaagcaaa gataaaagcg 1680 
gcaatattgt tctttccgcc aaagagggtg aagcggaaat tggcggtgta atttccgctc 1740
aaaatcagca agctaaaggc ggcaagctga tgataaagtc cgataaagtc acattaaaaa 1800 
caggtgcagt tatcgacctt tcaggtaaag aagggggaga aacttacctt ggcggtgacg 1860 
agcgcggcga aggtaaaaac ggcattcaat tagcaaagaa aacctcttta gaaaaaggct 1920 
caaccatcaa tgtatcaggc aaagaaaaag gcggacgcgc tattgtgtgg ggcgatattg 1980 
cgttaattga cggcaatatt aacgctcaag gtagtggtga tatcgctaaa accggtggtt 2040
ttgtggagac atcggggcat tatttatcca ttgacagcaa tgcaattgtt aaaacaaaag 2100 
agtggttgct agaccctgat gatgtaacaa ttgaagccga agaccccctt cgcaataata 2160 
ccggtataaa tgatgaattc ccaacaggca ccggtgaagc aagcgaccct aaaaaaaata 2220 
gcgaactcaa aacaacgcta accaatacaa ctatttcaaa ttatctgaaa aacgcctgga 2280 
caatgaatat aacggcatca agaaaactta ccgttaatag ctcaatcaac atcggaagca 2340
actcccactt aattctccat agtaaaggtc agcgtggcgg aggcgttcag attgatggag 2400 
atattacttc taaaggcgga aatttaacca tttattctgg cggatgggtt gatgttcata 2460 
aaaatattac gcttgatcag ggttttttaa atattaccgc cgcttccgta gcttttgaag 2520 
gtggaaataa caaagcacgc gacgcggcaa atgctaaaat tgtcgcccag ggcactgtaa 2580 
ccattacagg agagggaaaa gatttcaggg ctaacaacgt atctttaaac ggaacgggta 2640 
aaggtctgaa tatcatttca tcagtgaata atttaaccca caatcttagt ggcacaatta 2700 
acatatctgg gaatataaca attaaccaaa ctacgagaaa gaacacctcg tattggcaaa 2760 
ccagccatga ttcgcactgg aacgtcagtg ctcttaatct agagacaggc gcaaatttta 2820
cctttattaa atacatttca agcaatagca aaggcttaac aacacagtat agaagctctg 2880 
caggggtgaa ttttaacggc gtaaatggca acatgtcatt caatctcaaa gaaggagcga 2940
aagttaattt caaattaaaa ccaaacgaga acatgaacac aagcaaacct ttaccaattc 3000
ggtttttagc caatatcaca gccactggtg ggggctctgt tttttttgat atatatgcca 3060
accattctgg cagaggggct gagttaaaaa tgagtgaaat taatatctct aacggcgcta 3120 
attttacctt aaattcccat gttcgcggcg atgacgcttt taaaatcaac aaagacttaa 3180 
ccataaatgc aaccaattca aatttcagcc tcagacagac gaaagatgat ttttatgacg 3240
ggtacgcacg caatgccatc aattcaacct acaacatatc cattctgggc ggtaatgtca 3300 
cccttggtgg acaaaactca agcagcagca ttacggggaa tattactatc gagaaagcag 3360 
caaatgttac gctagaagcc aataacgccc ctaatcagca aaacataagg gatagagtta 3420 
taaaacttgg cagcttgctc gttaatggga gtttaagttt aactggcgaa aatgcagata 3480 
ttaaaggcaa tctcactatt tcagaaagcg ccacttttaa aggaaagact agagataccc 3540 
taaatatcac cggcaatttt accaataatg gcactgccga aattaatata acacaaggag 3600 
tggtaaaact tggcaatgtt accaatgatg gtgatttaaa cattaccact cacgctaaac 3660 
gcaaccaaag aagcatcatc ggcggagata taatcaacaa aaaaggaagc ttaaatatta 3720 
cagacagtaa taatgatgct gaaatccaaa ttggcggcaa tatctcgcaa aaagaaggca 3780 
acctcacgat ttcttccgat aaaattaata tcaccaaaca gataacaatc aaaaagggta 3840 
ttgatggaga ggactctagt tcagatgcga caagtaatgc caacctaact attaaaacca 3900
aagaattgaa attgacagaa gacctaagta tttcaggttt caataaagca gagattacag 3960
ccaaagatgg tagagattta actattggca acagtaatga cggtaacagc ggtgccgaag 4020
ccaaaacagt aacttttaac aatgttaaag attcaaaaat ctctgctgac ggtcacaatg 4080 
tgacactaaa tagcaaagtg aaaacatcta gcagcaatgg cggacgtgaa agcaatagcg 4140
acaacgatac cggcttaact attactgcaa aaaatgtaga agtaaacaaa gatattactt 4200 
ctctcaaaac agtaaatatc accgcgtcgg aaaaggttac caccacagca ggctcgacca 4260 
ttaacgcaac aaatggcaaa gcaagtatta caaccaaaac aggtgatatc agcggtacga 4320 
tttccggtaa cacggtaagt gttagcgcga ctggtgattt aaccactaaa tccggctcaa 4380 
aaattgaagc gaaatcgggt gaggctaatg taacaagtgc aacaggtaca attggcggta 4440
caatttccgg taatacggta aatgttacgg caaacgctgg cgatttaaca gttgggaatg 4500 
gcgcagaaat taatgcgaca gaaggagctg caaccttaac cgcaacaggg aataccttga 4560 
ctactgaagc cggttctagc atcacttcaa ctaagggtca ggtagacctc ttggctcaga 4620 
atggtagcat cgcaggaagc attaatgctg ctaatgtgac attaaatact acaggcacct 4680 
taaccaccgt ggcaggctcg gatattaaag caaccagcgg caccttggtt attaacgcaa 4740 
aagatgctaa gctaaatggt gatgcatcag gtgatagtac agaagtgaat gcagtcaacg 4800
actggggatt tggtagtgtg actgcggcaa cctcaagcag tgtgaatatc actggggatt 4860
taaacacagt aaatgggtta aatatcattt cgaaagatgg tagaaacact gtgcgcttaa 4920 
gaggcaagga aattgaggtg aaatatatcc agccaggtgt agcaagtgta gaagaagtaa 4980 
ttgaagcgaa acgcgtcctt gaaaaagtaa aagatttatc tgatgaagaa agagaaacat 5040 
tagctaaact tggtgtaagt gctgtacgtt ttgttgagcc aaataataca attacagtca 5100 
atacacaaaa tgaatttaca accagaccgt caagtcaagt gataatttct gaaggtaagg 5160 
cgtgtttctc aagtggtaat ggcgcacgag tatgtaccaa tgttgctgac gatggacagc 5220 
cgtagtcagt aattgacaag gtagatttca tcctgcaatg aagtcatttt attttcgtat 5280 
tatttactgt gtgggttaaa gttcagtacg ggctttaccc atcttgtaaa aaattacgga 5340 
gaatacaata aagtattttt aacaggttat tattatgaaa aatataaaaa gcagattaaa 5400 
actcagtgca atatcagtat tgcttggcct ggcttcttca tcattgtatg cagaagaagc 5460 
gtttttagta aaaggctttc agttatctgg tgcacttgaa actttaagtg aagacgccca 5520 
actgtctgta gcaaaatctt tatctaaata ccaaggctcg caaactttaa caaacctaaa 5580 
aacagcacag cttgaattac aggctgtgct agataagatt gagccaaata aatttgatgt 5640 
gatattgccg caacaaacca ttacggatgg caatatcatg tttgagctag tctcgaaatc 5700 
agccgcagaa agccaagttt tttataaggc gagccagggt tatagtgaag aaaatatcgc 5760 
tcgtagcctg ccatctttga aacaaggaaa agtgtatgaa gatggtcgtc agtggttcga 5820
tttgcgtgaa tttaatatgg caaaagaaaa cccgcttaag gttacccgtg tacattacga 5880
actaaaccct aaaaacaaaa cctctaattt gataattgcg ggcttctcgc cttttggtaa 5940 
aacgcgtagc tttatttctt atgataattt cggcgcgaga gagtttaact accaacgtgt 6000 
aagcttgggt tttgttaatg ccaatttaac tggtcatgat gatgtgttaa ttataccagt 6060
atgagttatg ctgattctaa tgatatcgac ggcttaccaa gtgcgattaa tcgtaaatta 6120
tcaaaaggtc aatctatctc tgcgaatctg aaatggagtt attatctccc aacatttaac 6180 
cttggcatgg aagaccaatt taaaattaat ttaggctaca actaccgcca tattaatcaa 6240
acctccgcgt taaatcgctt gggtgaaacg aagaaaaaat ttgcagtatc aggcgtaagt 6300
gcaggcattg atggacatat ccaatttacc cctaaaacaa tctttaatat tgatttaact 6360 
catcattatt acgcgagtaa attaccaggc tcttttggaa tggagcgcat tggcgaaaca 6420
tttaatcgca gctatcacat tagcacagcc agtttagggt tgagtcaaga gtttgctcaa 6480
ggttggcatt ttagcagtca attatcaggt caatttactc tacaagatat tagcagtata 6540 
gatttattct ctgtaacagg tacttatggc gtcagaggct ttaaatacgg cggtgcaagt 6600 
ggtgagcgcg gtcttgtatg gcgtaatgaa ttaagtatgc caaaatacac ccgcttccaa 6660 
atcagccctt atgcgtttta tgatgcaggt cagttccgtt ataatagcga aaatgctaaa 6720 
acttacggcg aagatatgca cacggtatcc tctgcgggtt taggcattaa aacctctcct 6780 
acacaaaact taagcctaga tgcttttgtt gctcgtcgct ttgcaaatgc caatagtgac 6840 
aatttgaatg gcaacaaaaa acgcacaagc tcacctacaa ccttctgggg gagattaaca 6900 
ttcagtttct aaccctgaaa tttaatcaac tggtaagcgt tccgcctacc agtttataac 6960 
tatatgcttt acccgccaat ttacagtcta taggcaaccc tgtttttacc cttatatatc 7020 
aaataaacaa gctaagctga gctaagcaaa ccaagcaaac tcaagcaagc caagtaatac 7080 
taaaaaaaca atttatatga taaactaaag tatactccat gccatggcga tacaagggat 7140 
ttaataatat gacaaaagaa aatttgcaaa acgctcctca agatgcgacc gctttacttg 7200 
cggaattaag caacaatcaa actcccctgc gaatatttaa acaaccacgc aagcccagcc 7260 
tattacgctt ggaacaacat atcgcaaaaa aagattatga gtttgcttgt cgtgaattaa 7320 
tggtgattct ggaaaaaatg gacgctaatt ttggaggcgt tcacgatatt gaatttgacg 7380 
cacccgctca gctggcatat ctacccgaaa aattactaat ttattttgcc actcgtctcg 7440 
ctaatgcaat tacaacactc ttttccgacc ccgaattggc aatttctgaa gaaggggcgt 7500 
taaagatgat tagcctgcaa cgctggttga cgctgatttt tgcctcttcc ccctacgtta 7560 
acgcagacca tattctcaat aaatataata tcaacccaga ttccgaaggt ggctttcatt 7620 
tagcaacaga caactcttct attgctaaat tctgtatttt ttacttaccc gaatccaatg 7680 
tcaatatgag tttagatgcg ttatgggcag ggaatcaaca actttgtgct tcattgtgtt 7740
ttgcgttgca gtcttcacgt tttattggta ccgcatctgc gtttcataaa agagcggtgg 7800 
ttttacagtg gtttcctaaa aaactcgccg aaattgctaa tttagatgaa ttgcctgcaa 7860 
atatccttca tgatgtatat atgcactgca gttatgattt agcaaaaaac aagcacgatg 7920 
ttaagcgtcc attaaacgaa cttgtccgca agcatatcct cacgcaagga tggcaagacc 7980 
gctaccttta caccttaggt aaaaaggacg gcaaacctgt gatgatggta ctgcttgaac 8040 
attttaattc gggacattcg atttatcgta cacattcaac ttcaatgatt gctgctcgag 8100 
aaaaattcta tttagtcggc ttaggccatg agggcgttga taaaataggt cgagaagtgt 8160 
ttgacgagtt ctttgaaatc agtagcaata atataatgga gagactgttt tttatccgta 8220 
aacagtgcga aactttccaa cccgcagtgt tctatatgcc aagcattggc atggatatta 8280 
ccacgatttt tgtgagcaac actcggcttg cccctattca agctgtagcc ctgggtcatc 8340 
ctgccactac gcattctgaa tttattgatt atgtcatcgt agaagatgat tatgtgggca 8400 
gtgaagattg tttcagcgaa acccttttac gcttacccaa agatgcccta ccttatgtac 8460
cttctgcact cgccccacaa aaagtggatt atgtactcag ggaaaaccct gaagtagtca 8520 
atatcggtat tgccgctacc acaatgaaat taaaccctga atttttgcta acattgcaag 8580 
aaatcagaga taaagctaaa gtcaaaatac attttcattt cgcacttgga caatcaacag 8640 
gcttgacaca cccttatgtc aaatggttta tcgaaagcta tttaggtgac gatgccactg 8700 
cacatcccca cgcaccttat cacgattatc tggcaatatt gcgtgattgc gatatgctac 8760 
taaatccgtt tcctttcggt aatactaacg gcataattga tatggttaca ttaggtttag 8820 
ttggtgtatg caaaacgggg gatgaagtac atgaacatat tgatgaaggt ctgtttaaac 8880 
gcttaggact accagaatgg ctgatagccg acacacgaga aacatatatt gaatgtgctt 8940 
tgcgtctagc agaaaaccat caagaacgcc ttgaactccg tcgttacatc atagaaaaca 9000 
acggcttaca aaagcttttt acaggcgacc ctcgtccatt gggcaaaata ctgcttaaga 9060 
aaacaaatga atggaagcgg aagcacttga gtaaaaaata acggtttttt aaagtaaaag 9120 
tgcggttaat tttcaaagcg ttttaaaaac ctctcaaaaa tcaaccgcac ttttatcttt 9180 
ataacgatcc cgcacgctga cagtttatca gcctcccgcc ataaaactcc gcctttcatg 9240
gcggagattt tagccaaaac tggcagaaat taaaggctaa aatcaccaaa ttgcaccaca 9300 
aaatcaccaa tacccacaaa aaa 9323 



<210> 7 
<211> 4794 
<212> DNA 

<213> Haemophilus influenzae 
<400> 7 

atgaacaaga tatatcgtct caaattcagc aaacgcctga atgctttggt tgctgtgtct 60 
gaattgacac ggggttgtga ccattccaca gaaaaaggca gtgaaaaacc tgttcgtacg 120
aaagtacgcc acttggcgtt aaagccactt tccgctatat tgctatcttt gggcatggca 180 
tccattccgc aatctgtttt agcgagcggt ttacagggaa tgagcgtcgt acacggtaca 240 
gcaaccatgc aagtagacgg caataaaacc actatccgta atagcgtcaa tgctatcatc 300 
aattggaaac aatttaacat tgaccaaaat gaaatggtgc agtttttaca agaaagcagc 360 
aactctgccg ttttcaaccg tgttacatct gaccaaatct cccaattaaa agggatttta 420 
gattctaacg gacaagtctt tttaatcaac ccaaatggta tcacaatagg taaagacgca 480 
attattaaca ctaatggctt tactgcttct acgctagaca tttctaacga aaacatcaag 540 
gcgcgtaatt tcacccttga gcaaaccaag gataaagcac tcgctgaaat cgtgaatcac 600 
ggtttaatta ccgttggtaa agacggtagc gtaaacctta ttggtggcaa agtgaaaaac 660 
gagggcgtga ttagcgtaaa tggcggtagt atttctttac ttgcagggca aaaaatcacc 720 
atcagcgata taataaatcc aaccatcact tacagcattg ctgcacctga aaacgaagcg 780 
atcaatctgg gcgatatttt tgccaaaggt ggtaacatta atgtccgcgc tgccactatt 840 
cgcaataaag gtaaactttc tgccgactct gtaagcaaag ataaaagtgg taacattgtt 900 
ctctctgcca aagaaggtga agcggaaatt ggcggtgtaa tttccgctca aaatcagcaa 960 
gccaaaggtg gtaagttgat gattacaggc gataaagtta cattgaaaac gggtgcagtt 1020 
atcgaccttt cgggtaaaga agggggagaa acttatcttg gcggtgacga gcgtggcgaa 1080 
ggtaaaaacg gcattcaatt agcaaagaaa accactttag aaaaaggctc aacaattaat 1140 
gtgtcaggta aagaaaaagg tgggcgcgct attgtatggg gcgatattgc gttaattgac 1200 
ggcaatatta atgcccaagg taaagatatc gctaaaactg gtggttttgt ggagacgtcg 1260 
gggcattact tatccattga tgataacgca attgttaaaa caaaagaatg gctactagac 1320 
ccagagaatg tgactattga agctccttcc gcttctcgcg tcgagctggg tgccgatagg 1380 
aattcccact cggcagaggt gataaaagtg accctaaaaa aaaataacac ctccttgaca 1440 
acactaacca atacaaccat ttcaaatctt ctgaaaagtg cccacgtggt gaacataacg 1500 
gcaaggagaa aacttaccgt taatagctct atcagtatag aaagaggctc ccacttaatt 1560 
ctccacagtg aaggtcaggg cggtcaaggt gttcagattg ataaagatat tacttctgaa 1620 
ggcggaaatt taaccattta ttctggcgga tgggttgatg ttcataaaaa tattacgctt 1680 
ggtagcggct ttttaaacat cacaactaaa gaaggagata tcgccttcga agacaagtct 1740 
ggacggaaca acctaaccat tacagcccaa gggaccatca cctcaggtaa tagtaacggc 1800 
tttagattta acaacgtctc tctaaacagc cttggcggaa agctgagctt tactgacagc 1860 
agagaggaca gaggtagaag aactaagggt aatatctcaa acaaatttga cggaacgtta 1920 
aacatttccg gaactgtaga tatctcaatg aaagcaccca aagtcagctg gttttacaga 1980 
gacaaaggac gcacctactg gaacgtaacc actttaaatg ttacctcggg tagtaaattt 2040
aacctctcca ttgacagcac aggaagtggc tcaacaggtc caagcatacg caatgcagaa 2100 
ttaaatggca taacatttaa taaagccact tttaatatcg cacaaggctc aacagctaac 2160 
tttagcatca aggcatcaat aatgcccttt aagagtaacg ctaactacgc attatttaat 2220 
gaagatattt cagtctcagg ggggggtagc cttaatttca aacttaacgc ctcatctagc 2280 
aacatacaaa cccctggcgt aattataaaa tctcaaaact ttaatgtctc aggagggtca 2340 
actttaaatc tcaaggctga aggttcaaca gaaaccgctt tttcaataga aaatgattta 2400 
aacttaaacg ccaccggtgg caatataaca atcagacaag tcgagggtac cgattcacgc 2460 
gtcaacaaag gtgtcgcagc caaaaaaaac ataactttta aagggggtaa tatcaccttc 2520 
ggctctcaaa aagccacaac agaaatcaaa ggcaatgtta ccatcaataa aaacactaac 2580 
gctactcttt gtggtgcgaa ttttgccgaa aacaaatcgc ctttaaatat agcaggaaat 2640 
gttattaata atggcaacct taccactgcc ggctccatta tcaatatagc cggaaatctt 2700 
actgtttcaa aaggcgctaa ccttcaagct ataacaaatt acacttttaa tgtagccggc 2760 
tcatttgaca acaatggcgc ttcaaacatt tccattgcca gaggaggggc taaatttaaa 2820 
gatatcaata acaccagtag cttaaatatt accaccaact ctgataccac ttaccgcacc 2880 
attataaaag gcaatatatc caacaaatca ggtgatttga atattattga taaaaaaagc 2940
gacgctgaaa tccaaattgg cggcaatatc tcacaaaaag aaggcaatct cacaatttct 3000
tctgataaag taaatattac caatcagata acaatcaaag caggcgttga aggggggcgt 3060
tctgattcaa gtgaggcaga aaatgctaac ctaactattc aaaccaaaga gttaaaattg 3120
gcaggagacc taaatatttc aggctttaat aaagcagaaa ttacagctaa aaatggcagt 3180 
gatttaacta ttggcaatgc tagcggtggt aatgctgatg ctaaaaaagt gacttttgac 3240 
aaggttaaag attcaaaaat ctcgactgac ggtcacaatg taacactaaa tagcgaagtg 3300 
aaaacgtcta atggtagtag caatgctggt aatgataaca gcaccggttt aaccatttcc 3360 
gcaaaagatg taacggtaaa caataacgtt acctcccaca agacaataaa tatctctgcc 3420
gcagcaggaa atgtaacaac caaagaaggc acaactatca atgcaaccac aggcagcgtg 3480
gaagtaactg ctcaaaatgg tacaattaaa ggcaacatta cctcgcaaaa tgtaacagtg 3540 
acagcaacag aaaatcttgt taccacagag aatgctgtca ttaatgcaac cagcggcaca 3600 
gtaaacatta gtacaaaaac aggggatatt aaaggtggaa ttgaatcaac ttccggtaat 3660 
gtaaatatta cagcgagcgg caatacactt aaggtaagta atatcactgg tcaagatgta 3720 
acagtaacag cggatgcagg agccttgaca actacagcag gctcaaccat tagtgcgaca 3780 
acaggcaatg caaatattac aaccaaaaca ggtgatatca acggtaaagt tgaatccagc 3840 
tccggctctg taacacttgt tgcaactgga gcaactcttg ctgtaggtaa tatttcaggt 3900 
aacactgtta ctattactgc ggatagcggt aaattaacct ccacagtagg ttctacaatt 3960 
aatgggacta atagtgtaac cacctcaagc caatcaggcg atattgaagg tacaatttct 4020 
ggtaatacag taaatgttac agcaagcact ggtgatttaa ctattggaaa tagtgcaaaa 4080 
gttgaagcga aaaatggagc tgcaacctta actgctgaat caggcaaatt aaccacccaa 4140 
acaggctcta gcattacctc aagcaatggt cagacaactc ttacagccaa ggatagcagt 4200
atcgcaggaa acattaatgc tgctaatgtg acgttaaata ccacaggcac tttaactact 4260 
acaggggatt caaagattaa cgcaaccagt ggtaccttaa caatcaatgc aaaagatgcc 4320 
aaattagatg gtgctgcatc aggtgaccgc acagtagtaa atgcaactaa cgcaagtggc 4380
tctggtaacg tgactgcgaa aacctcaagc agcgtgaata tcaccgggga tttaaacaca 4440 
ataaatgggt taaatatcat ttcggaaaat ggtagaaaca ctgtgcgctt aagaggcaag 4500 
gaaattgatg tgaaatatat ccaaccaggt gtagcaagcg tagaagaggt aattgaagcg 4560 
aaacgcgtcc ttgagaaggt aaaagattta tctgatgaag aaagagaaac actagccaaa 4620 
cttggtgtaa gtgctgtacg tttcgttgag ccaaataatg ccattacggt taatacacaa 4680 
aacgagttta caaccaaacc atcaagtcaa gtgacaattt ctgaaggtaa ggcgtgtttc 4740 
tcaagtggta atggcgcacg agtatgtacc aatgttgctg acgatggaca gcag 4794



<210> 8 
<211> 4803 
<212> DNA 

<213> Haemophilus influenzae 

<400> 8 

atgaacaaga tatatcgtct caaattcagc aaacgcctga atgctttggt tgctgtgtct 60 
gaattgacac ggggttgtga ccattccaca gaaaaaggca gtgaaaaacc tgttcgtacg 120 
aaagtacgcc acttggcgtt aaagccactt tccgctatat tgctatcttt gggcatggca 180 
tccattccgc aatctgtttt agcgagcggt ttacagggaa tgagcgtcgt acacggtaca 240 
gcaaccatgc aagtagacgg caataaaacc actatccgta atagcgtcaa tgctatcatc 300 
aattggaaac aatttaacat tgaccaaaat gaaatggtgc agtttttaca agaaagcagc 360 
aactctgccg ttttcaaccg tgttacatct gaccaaatct cccaattaaa agggatttta 420 
gattctaacg gacaagtctt tttaatcaac ccaaatggta tcacaatagg taaagacgca 480 
attattaaca ctaatggctt tactgcttct acgctagaca tttctaacga aaacatcaag 540 
gcgcgtaatt tcacccttga gcaaaccaag gataaagcac tcgctgaaat cgtgaatcac 600 
ggtttaatta ccgttggtaa agacggtagc gtaaacctta ttggtggcaa agtgaaaaac 660 
gagggcgtga ttagcgtaaa tggcggtagt atttctttac ttgcagggca aaaaatcacc 720 
atcagcgata taataaatcc aaccatcact tacagcattg ctgcacctga aaacgaagcg 780 
atcaatctgg gcgatatttt tgccaaaggt ggtaacatta atgtccgcgc tgccactatt 840 
cgcaataaag gtaaactttc tgccgactct gtaagcaaag ataaaagtgg taacattgtt 900 
ctctctgcca aagaaggtga agcggaaatt ggcggtgtaa tttccgctca aaatcagcaa 960 
gccaaaggtg gtaagttgat gattacaggt gataaagtca cattaaaaac aggtgcagtt 1020 
atcgaccttt caggtaaaga agggggagag acttatcttg gcggtgatga gcgtggcgaa 1080 
ggtaaaaatg gtattcaatt agcgaagaaa acctctttag aaaaaggctc gacaattaat 1140 
gtatcaggca aagaaaaagg cgggcgcgct attgtatggg gcgatattgc attaattaat 1200 
ggtaacatta atgctcaagg tagcgatatt gctaaaactg gcggctttgt ggaaacatca 1260 
ggacatgact tatccattgg tgatgatgtg attgttgacg ctaaagagtg gttattagac 1320
ccagatgatg tgtccattga aactcttaca tctggacgca ataataccgg cgaaaaccaa 1380
ggatatacaa caggagatgg gactaaagag tcacctaaag gtaatagtat ttctaaacct 1440
acattaacaa actcaactct tgagcaaatc ctaagaagag gttcttatgt taatatcact 1500
gctaataata gaatttatgt taatagctcc atcaacttat ctaatggcag tttaacactt 1560
cacactaaac gagatggagt taaaattaac ggtgatatta cctcaaacga aaatggtaat 1620
ttaaccatta aagcaggctc ttgggttgat gttcataaaa acatcacgct tggtacgggt 1680
tttttgaata ttgtcgctgg ggattctgta gcttttgaga gagagggcga taaagcacgt 1740
aacgcaacag atgctcaaat taccgcacaa gggacgataa ccgtcaataa agatgataaa 1800
caatttagat tcaataatgt atctattaac gggacgggca agggtttaaa gtttattgca 1860
aatcaaaata atttcactca taaatttgat ggcgaaatta acatatctgg aatagtaaca 1920
attaaccaaa ccacgaaaaa agatgttaaa tactggaatg catcaaaaga ctcttactgg 1980
aatgtttctt ctcttacttt gaatacggtg caaaaattta cctttataaa attcgttgat 2040
agcggctcaa attcccaaga tttgaggtca tcacgtagaa gttttgcagg cgtacatttt 2100
aacggcatcg gaggcaaaac aaacttcaac atcggagcta acgcaaaagc cttatttaaa 2160
ttaaaaccaa acgccgctac agacccaaaa aaagaattac ctattacttt taacgccaac 2220
attacagcta ccggtaacag tgatagctct gtgatgtttg acatacacgc caatcttacc 2280
tctagagctg ccggcataaa catggattca attaacatta ccggcgggct tgacttttcc 2340
ataacatccc ataatcgcaa tagtaatgct tttgaaatca aaaaagactt aactataaat 2400
gcaactggct cgaattttag tcttaagcaa acgaaagatt ctttttataa tgaatacagc 2460
aaacacgcca ttaactcaag tcataatcta accattcttg gcggcaatgt cactctaggt 2520
ggggaaaatt caagcagtag cattacgggc aatatcaata tcaccaataa agcaaatgtt 2580
acattacaag ctgacaccag caacagcaac acaggcttga agaaaagaac tctaactctt 2640
ggcaatatat ctgttgaggg gaatttaagc ctaactggtg caaatgcaaa cattgtcggc 2700
aatctttcta ttgcagaaga ttccacattt aaaggagaag ccagtgacaa cctaaacatc 2760
accggcacct ttaccaacaa cggtaccgcc aacattaata taaaacaagg agtggtaaaa 2820
ctccaaggcg atattatcaa taaaggtggt ttaaatatca ctactaacgc ctcaggcact 2880
caaaaaacca ttattaacgg aaatataact aacgaaaaag gcgacttaaa catcaagaat 2940
attaaagccg acgccgaaat ccaaattggc ggcaatatct cacaaaaaga aggcaatctc 3000
acaatttctt ctgataaagt aaatattacc aatcagataa caatcaaagc aggcgttgaa 3060
ggggggcgtt ctgattcaag tgaggcagaa aatgctaacc taactattca aaccaaagag 3120
ttaaaattgg caggagacct aaatatttca ggctttaata aagcagaaat tacagctaaa 3180
aatggcagtg atttaactat tggcaatgct agcggtggta atgctgatgc taaaaaagtg 3240
acttttgaca aggttaaaga ttcaaaaatc tcgactgacg gtcacaatgt aacactaaat 3300
agcgaagtga aaacgtctaa tggtagtagc aatgctggta atgataacag caccggttta 3360
accatttccg caaaagatgt aacggtaaac aataacgtta cctcccacaa gacaataaat 3420
atctctgccg cagcaggaaa tgtaacaacc aaagaaggca caactatcaa tgcaaccaca 3480
ggcagcgtgg aagtaactgc tcaaaatggt acaattaaag gcaacattac ctcgcaaaat 3540
gtaacagtga cagcaacaga aaatcttgtt accacagaga atgctgtcat taatgcaacc 3600
agcggcacag taaacattag tacaaaaaca ggggatatta aaggtggaat tgaatcaact 3660
tccggtaatg taaatattac agcgagcggc aatacactta aggtaagtaa tatcactggt 3720
caagatgtaa cagtaacagc ggatgcagga gccttgacaa ctacagcagg ctcaaccatt 3780
agtgcgacaa caggcaatgc aaatattaca accaaaacag gtgatatcaa cggtaaagtt 3840
gaatccagct ccggctctgt aacacttgtt gcaactggag caactcttgc tgtaggtaat 3900
atttcaggta acactgttac tattactgcg gatagcggta aattaacctc cacagtaggt 3960
tctacaatta atgggactaa tagtgtaacc acctcaagcc aatcaggcga tattgaaggt 4020
acaatttctg gtaatacagt aaatgttaca gcaagcactg gtgatttaac tattggaaat 4080
agtgcaaaag ttgaagcgaa aaatggagct gcaaccttaa ctgctgaatc aggcaaatta 4140
accacccaaa caggctctag cattacctca agcaatggtc agacaactct tacagccaag 4200
gatagcagta tcgcaggaaa cattaatgct gctaatgtga cgttaaatac cacaggcact 4260
ttaactacta caggggattc aaagattaac gcaaccagtg gtaccttaac aatcaatgca 4320
aaagatgcca aattagatgg tgctgcatca ggtgaccgca cagtagtaaa tgcaactaac 4380
gcaagtggct ctggtaacgt gactgcgaaa acctcaagca gcgtgaatat caccggggat 4440
ttaaacacaa taaatgggtt aaatatcatt tcggaaaatg gtagaaacac tgtgcgctta 4500
agaggcaagg aaattgatgt gaaatatatc caaccaggtg tagcaagcgt agaagaggta 4560
attgaagcga aacgcgtcct tgagaaggta aaagatttat ctgatgaaga aagagaaaca 4620
ctagccaaac ttggtgtaag tgctgtacgt ttcgttgagc caaataatgc cattacggtt 4680
aatacacaaa acgagtttac aaccaaacca tcaagtcaag tgacaatttc tgaaggtaag 4740
gcgtgtttct caagtggtaa tggcgcacga gtatgtacca atgttgctga cgatggacag 4800
cag 4803



<210> 9
<211> 1599 
<212> PRT 

<213> Haemophilus influenzae 
<400> 9 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Thr Ile Arg Asn Ser Val
85 90 95

Asn Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Glu Gln Phe Leu Gln Glu Ser Ser Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asp Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Leu Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Ile Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Lys Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Thr Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Lys Asp Ile Ala Lys Thr Gly Gly Phe
405 410 415

Val Glu Thr Ser Gly His Tyr Leu Ser Ile Asp Asp Asn Ala Ile Val
420 425 430

Lys Thr Lys Glu Trp Leu Leu Asp Pro Glu Asn Val Thr Ile Glu Ala
435 440 445

Pro Ser Ala Ser Arg Val Glu Leu Gly Ala Asp Arg Asn Ser His Ser
450 455 460

Ala Glu Val Ile Lys Val Thr Leu Lys Lys Asn Asn Thr Ser Leu Thr
465 470 475 480

Thr Leu Thr Asn Thr Thr Ile Ser Asn Leu Leu Lys Ser Ala His Val
485 490 495

Val Asn Ile Thr Ala Arg Arg Lys Leu Thr Val Asn Ser Ser Ile Ser
500 505 510

Ile Glu Arg Gly Ser His Leu Ile Leu His Ser Glu Gly Gln Gly Gly
515 520 525

Gln Gly Val Gln Ile Asp Lys Asp Ile Thr Ser Glu Gly Gly Asn Leu
530 535 540

Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr Leu
545 550 555 560

Gly Ser Gly Phe Leu Asn Ile Thr Thr Lys Glu Gly Asp Ile Ala Phe
565 570 575

Glu Asp Lys Ser Gly Arg Asn Asn Leu Thr Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Ser Asn Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Ser Leu Gly Gly Lys Leu Ser Phe Thr Asp Ser Arg Glu Asp Arg
610 615 620

Gly Arg Arg Thr Lys Gly Asn Ile Ser Asn Lys Phe Asp Gly Thr Leu
625 630 635 640

Asn Ile Ser Gly Thr Val Asp Ile Ser Met Lys Ala Pro Lys Val Ser
645 650 655

Trp Phe Tyr Arg Asp Lys Gly Arg Thr Tyr Trp Asn Val Thr Thr Leu
660 665 670

Asn Val Thr Ser Gly Ser Lys Phe Asn Leu Ser Ile Asp Ser Thr Gly
675 680 685

Ser Gly Ser Thr Gly Pro Ser Ile Arg Asn Ala Glu Leu Asn Gly Ile
690 695 700

Thr Phe Asn Lys Ala Thr Phe Asn Ile Ala Gln Gly Ser Thr Ala Asn
705 710 715 720

Phe Ser Ile Lys Ala Ser Ile Met Pro Phe Lys Ser Asn Ala Asn Tyr
725 730 735

Ala Leu Phe Asn Glu Asp Ile Ser Val Ser Gly Gly Gly Ser Val Asn
740 745 750

Phe Lys Leu Asn Ala Ser Ser Ser Asn Ile Gln Thr Pro Gly Val Ile
755 760 765

Ile Lys Ser Gln Asn Phe Asn Val Ser Gly Gly Ser Thr Leu Asn Leu
770 775 780

Lys Ala Glu Gly Ser Thr Glu Thr Ala Phe Ser Ile Glu Asn Asp Leu
785 790 795 800

Asn Leu Asn Ala Thr Gly Gly Asn Ile Thr Ile Arg Gln Val Glu Gly
805 810 815

Thr Asp Ser Arg Val Asn Lys Gly Val Ala Ala Lys Lys Asn Ile Thr
820 825 830

Phe Lys Gly Gly Asn Ile Thr Phe Gly Ser Gln Lys Ala Thr Thr Glu
835 840 845

Ile Lys Gly Asn Val Thr Ile Asn Lys Asn Thr Asn Ala Thr Leu Arg
850 855 860

Gly Ala Asn Phe Ala Glu Asn Lys Ser Pro Leu Asn Ile Ala Gly Asn
865 870 875 880

Val Ile Asn Asn Gly Asn Leu Thr Thr Ala Gly Ser Ile Ile Asn Ile
885 890 895

Ala Gly Asn Leu Thr Val Ser Lys Gly Ala Asn Leu Gln Ala Ile Thr
900 905 910

Asn Tyr Thr Phe Asn Val Ala Gly Ser Phe Asp Asn Asn Gly Ala Ser 
915 920 925 

Asn Ile Ser Ile Ala Arg Gly Gly Ala Lys Phe Lys Asp Ile Asn Asn
930 935 940

Thr Ser Ser Leu Asn Ile Thr Thr Asn Ser Asp Thr Thr Tyr Arg Thr
945 950 955 960

Ile Ile Lys Gly Asn Ile Ser Asn Lys Ser Gly Asp Leu Asn Ile Ile
965 970 975

Asp Lys Lys Ser Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser Gln
980 985 990

Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr Asn
995 1000 1005

Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser Ser
1010 1015 1020

Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys Leu
1025 1030 1035 1040

Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala
1045 1050 1055

Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn Ala
1060 1065 1070

Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

Thr Asp Gly His Asn Val Thr Leu Asn Ser Glu Val Lys Thr Ser Asn
1090 1095 1100

Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile Ser
1105 1110 1115 1120

Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr Ile
1125 1130 1135

Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr Thr
1140 1145 1150

Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly Thr
1155 1160 1165

Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr Glu
1170 1175 1180

Asn Leu Val Thr Thr Glu Asn Ala Val Ile Asn Ala Thr Ser Gly Thr
1185 1190 1195 1200

Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu Ser
1205 1210 1215

Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys Val
1220 1225 1230

Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly Ala
1235 1240 1245

Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn Ala
1250 1255 1260

Asn Ile Thr Thr Lys Thr Gly Asp Ile Asn Gly Lys Val Glu Ser Ser
1265 1270 1275 1280

Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val Gly
1285 1290 1295

Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys Leu
1300 1305 1310

Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr Thr
1315 1320 1325

Ser Ser Gln Ser Gly Asp Ile Glu Gly Thr Ile Ser Gly Asn Thr Val
1330 1335 1340

Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala Lys
1345 1350 1355 1360

Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly Lys
1365 1370 1375

Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln Thr
1380 1385 1390

Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala Ala
1395 1400 1405

Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp Ser
1410 1415 1420

Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp Ala
1425 1430 1435 1440

Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala Thr
1445 1450 1455

Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser Val
1460 1465 1470

Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile Ser
1475 1480 1485

Glu Asn Gly Arg Asn Thr Val Arg Leu Arg Gly Lys Glu Ile Asp Val
1490 1495 1500

Lys Tyr Ile Gln Pro Gly Val Ala Ser Val Glu Glu Val Ile Glu Ala
1505 1510 1515 1520

Lys Arg Val Leu Glu Lys Val Lys Asp Leu Ser Asp Glu Glu Arg Glu
1525 1530 1535

Thr Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Val Glu Pro Asn
1540 1545 1550

Asn Ala Ile Thr Val Asn Thr Gln Asn Glu Phe Thr Thr Lys Pro Ser
1555 1560 1565

Ser Gln Val Thr Ile Ser Glu Gly Lys Ala Cys Phe Ser Ser Gly Asn
1570 1575 1580
Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro
1585 1590 1595



<210> 10
<211> 1600
<212> PRT

<213> Haemophilus influenzae
<400> 10

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Thr Ile Arg Asn Ser Val
85 90 95

Asn Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Glu Gln Phe Leu Gln Glu Ser Ser Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asp Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Leu Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Ile Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Lys Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Thr Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Asp Ile Ala Lys Thr Gly Gly Phe
405 410 415

Val Glu Thr Ser Gly His Asp Leu Ser Ile Gly Asp Asp Val Ile Val
420 425 430

Asp Ala Lys Glu Trp Leu Leu Asp Pro Asp Asp Val Ser Ile Glu Thr
435 440 445

Leu Thr Ser Gly Arg Asn Asn Thr Gly Glu Asn Gln Gly Tyr Thr Thr
450 455 460

Gly Asp Gly Thr Lys Glu Ser Pro Lys Gly Asn Ser Ile Ser Lys Pro
465 470 475 480

Thr Leu Thr Asn Ser Thr Leu Glu Gln Ile Leu Arg Arg Gly Ser Tyr
485 490 495

Val Asn Ile Thr Ala Asn Asn Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu His Thr Lys Arg Asp Gly Val Lys
515 520 525

Ile Asn Gly Asp Ile Thr Ser Asn Glu Asn Gly Asn Leu Thr Ile Lys
530 535 540

Ala Gly Ser Trp Val Asp Val His Lys Asn Ile Thr Leu Gly Thr Gly
545 550 555 560

Phe Leu Asn Ile Val Ala Gly Asp Ser Val Ala Phe Glu Arg Glu Gly
565 570 575

Asp Lys Ala Arg Asn Ala Thr Asp Ala Gln Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Val Asn Lys Asp Asp Lys Gln Phe Arg Phe Asn Asn Val Ser
595 600 605

Leu Asn Gly Thr Gly Lys Gly Leu Lys Phe Ile Ala Asn Gln Asn Asn
610 615 620

Phe Thr His Lys Phe Asp Gly Glu Ile Asn Ile Ser Gly Ile Val Thr
625 630 635 640

Ile Asn Gln Thr Thr Lys Lys Asp Val Lys Tyr Trp Asn Ala Ser Lys
645 650 655

Asp Ser Tyr Trp Asn Val Ser Ser Leu Thr Leu Asn Thr Val Gln Lys
660 665 670

Phe Thr Phe Ile Lys Phe Val Asp Ser Gly Ser Asn Gly Gln Asp Leu
675 680 685

Arg Ser Ser Arg Arg Ser Phe Ala Gly Val His Phe Asn Gly Ile Gly
690 695 700

Gly Lys Thr Asn Phe Asn Ile Gly Ala Asn Ala Lys Ala Leu Phe Lys
705 710 715 720

Leu Lys Pro Asn Ala Ala Thr Asp Pro Lys Lys Glu Leu Pro Ile Thr
725 730 735

Phe Asn Ala Asn Ile Thr Ala Thr Gly Asn Ser Asp Ser Ser Val Met
740 745 750

Phe Asp Ile His Ala Asn Leu Thr Ser Arg Ala Ala Gly Ile Asn Met
755 760 765

Asp Ser Ile Asn Ile Thr Gly Gly Leu Asp Phe Ser Ile Thr Ser His
770 775 780

Asn Arg Asn Ser Asn Ala Phe Glu Ile Lys Lys Asp Leu Thr Ile Asn
785 790 795 800

Ala Thr Gly Ser Asn Phe Ser Leu Lys Gln Thr Lys Asp Ser Phe Tyr
805 810 815

Asn Glu Tyr Ser Lys His Ala Ile Asn Ser Ser His Asn Leu Thr Ile
820 825 830

Leu Gly Gly Asn Val Thr Leu Gly Gly Glu Asn Ser Ser Ser Ser Ile
835 840 845

Thr Gly Asn Ile Asn Ile Thr Asn Lys Ala Asn Val Thr Leu Gln Ala
850 855 860

Asp Thr Ser Asn Ser Asn Thr Gly Leu Lys Lys Arg Thr Leu Thr Leu
865 870 875 880

Gly Asn Ile Ser Val Glu Gly Asn Leu Ser Leu Thr Gly Ala Asn Ala
885 890 895

Asn Ile Val Gly Asn Leu Ser Ile Ala Glu Asp Ser Thr Phe Lys Gly
900 905 910

Glu Ala Ser Asp Asn Leu Asn Ile Thr Gly Thr Phe Thr Asn Asn Gly
915 920 925

Thr Ala Asn Ile Asn Ile Lys Gly Val Val Lys Leu Gly Asp Ile Asn
930 935 940

Asn Lys Gly Gly Leu Asn Ile Thr Thr Asn Ala Ser Gly Thr Gln Lys
945 950 955 960

Thr Ile Ile Asn Gly Asn Ile Thr Asn Glu Lys Gly Asp Leu Asn Ile
965 970 975

Lys Asn Ile Lys Ala Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr
995 1000 1005

Asn Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser
1010 1015 1020

Ser Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn
1060 1065 1070

Ala Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile
1075 1080 1085

Ser Thr Asp Gly His Asn Val Thr Leu Asn Ser Glu Val Lys Thr Ser
1090 1095 1100

Asn Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile
1105 1110 1115 1120

Ser Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr
1125 1130 1135

Ile Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr
1140 1145 1150

Thr Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly
1155 1160 1165

Thr Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr
1170 1175 1180

Glu Asn Leu Val Thr Thr Glu Asn Ala Val Ile Asn Ala Thr Ser Gly
1185 1190 1195 1200

Thr Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu
1205 1210 1215

Ser Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys
1220 1225 1230

Val Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly
1235 1240 1245

Ala Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn
1250 1255 1260

Ala Asn Ile Thr Thr Lys Thr Gly Asp Ile Asn Gly Lys Val Glu Ser
1265 1270 1275 1280

Ser Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val
1285 1290 1295

Gly Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys
1300 1305 1310

Leu Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr
1315 1320 1325

Thr Ser Ser Gln Ser Gly Asp Ile Glu Gly Thr Ile Ser Gly Asn Thr
1330 1335 1340

Val Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala
1345 1350 1355 1360

Lys Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly
1365 1370 1375

Lys Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln
1380 1385 1390

Thr Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala
1395 1400 1405

Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp
1410 1415 1420

Ser Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp
1425 1430 1435 1440

Ala Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala
1445 1450 1455

Thr Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser 
1460 1465 1470 

Val Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile
1475 1480 1485

Ser Glu Asn Gly Arg Asn Thr Val Arg Leu Arg Gly Lys Glu Ile Asp
1490 1495 1500

Val Lys Tyr Ile Gln Pro Gly Val Ala Ser Val Glu Glu Val Ile Glu
1505 1510 1515 1520

Ala Lys Arg Val Leu Glu Lys Val Lys Asp Leu Ser Asp Glu Glu Arg
1525 1530 1535

Glu Thr Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Val Glu Pro
1540 1545 1550

Asn Asn Ala Ile Thr Val Asn Thr Gln Asn Glu Phe Thr Thr Lys Pro
1555 1560 1565

Ser Ser Gln Val Thr Ile Ser Glu Gly Lys Ala Cys Phe Ser Ser Gly
1570 1575 1580

Asn Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro
1585 1590 1595 1600



<210> 11
<211> 29
<212> PRT

<213> Haemophilus influenzae
<400> 11

Val Asp Glu Val Ile Glu Ala Lys Arg Ile Leu Glu Lys Val Lys Asp
1 5 10 15

Leu Ser Asp Glu Glu Arg Glu Ala Leu Ala Lys Leu Gly
20 25



